Friday, July 31, 2009

Significant Milepost

I have a functional module, called bpp, that incorporates all five libraries from Bio++. I've played with it in Python, and it seems to all work fine.

Later this evening and this weekend, I start playing with other languages, and testing.

So Close...

Got PhylLib working. (That is, it runs within Python with no obvious errors.)

Turned my attention to PopGenLib. I got it all compiling, except for one object: CoordsTools. That has mysterious compiling errors that are annoying me.

Here's what I know about the object:

  • It's defined within CoordsTools.h, and there's no CoordsTools.cpp.
  • No other file #includes it.
  • Because of these two facts, it doesn't actually get compiled with anything, when one compiles PopGenLib. Accessing the object would require one to specifically #include CoordsTools.h.
  • It's pretty much a static object--the only function of import is one to calculate the distance between two different "coordinates" that are of templated type.
  • It hasn't been modified in just over 5 years.
  • All the errors I'm having seem to be errors within the .h file itself. There are no issues I can see with the wrapper file.
  • Bio++ documentation doesn't even list it as a proper member of the bpp namespace.

Given all of this, I'm guessing it's an orphaned file. It may be that it didn't get included with some of the project's recent bug fixes, to allow the files to be compiled under current versions of gcc (or to be part of the bpp namespace). In any case, I'm seriously inclined to ignore the stupid thing. We're talking about one function, that's not called anywhere else.
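
For a sense of the shape of such a header-only utility, here's a minimal sketch. It is not the actual CoordsTools code: Point2D, PointTools, and the Euclidean formula are all assumptions made up for illustration.

```cpp
#include <cmath>

// Hypothetical stand-in for a templated coordinate type (not a Bio++ class).
template <class T>
struct Point2D {
  T x;
  T y;
};

// Header-only utility: a single static function template, so there is
// nothing that would need to live in a .cpp file.
struct PointTools {
  // Euclidean distance between two coordinates of the same templated type.
  template <class T>
  static double distance(const Point2D<T>& a, const Point2D<T>& b) {
    const double dx = static_cast<double>(a.x) - static_cast<double>(b.x);
    const double dy = static_cast<double>(a.y) - static_cast<double>(b.y);
    return std::sqrt(dx * dx + dy * dy);
  }
};
```

A function template like this is only instantiated when some other file includes the header and calls it, which would fit with errors in the header going unnoticed for years.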

So, with this exception...I think I'm almost done with writing these .i files. I need to put them all together into a single python object, but that should be easy to do. I'll do it in the morning, when my head is clear.

After that...I start the serious bug testing.

Thursday, July 30, 2009

Bug in Bio++

I'm pretty sure that a bug in Bio++ is the reason for the failure to compile. There are two overloaded functions in TreeTemplateTools--one is called searchNodeWithId, and the other is searchNodeWithName. They're both pretty similar. Each one has two versions: one returns a list of Nodes; the other returns void and takes the list of Nodes as one of its arguments, passing the result back by reference. Except...the by-reference version of searchNodeWithName calls the value-returning version of searchNodeWithId, and it does so using the name as the argument. I believe this was just cut-and-pasted from the by-reference searchNodeWithId.

The code is in TreeTemplateTools.h. If I modify it using the above assumption (that one version of searchNodeWithName should call the other version of the same), everything compiles wonderfully. However, I'm reluctant to call a bug report into the project, since I've only been looking through this code for a few months. I sent a query to the mailing list, asking if this is indeed a bug. Hopefully someone will get back to me.
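
To make the pattern concrete, here's a stripped-down sketch. It is not the Bio++ source: Node is a made-up stand-in, the signatures are simplified, and the functions are free functions rather than static template members. The suspected bad call appears only in a comment, and the live line is the presumed fix.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical tree node (the real Bio++ Node is richer).
struct Node {
  int id;
  std::string name;
  std::vector<Node*> sons;
};

// Value-returning version: collects every node in the subtree with this id.
inline std::vector<Node*> searchNodeWithId(Node& node, int id) {
  std::vector<Node*> found;
  if (node.id == id) found.push_back(&node);
  for (std::size_t i = 0; i < node.sons.size(); ++i) {
    std::vector<Node*> sub = searchNodeWithId(*node.sons[i], id);
    found.insert(found.end(), sub.begin(), sub.end());
  }
  return found;
}

// Value-returning version: collects every node in the subtree with this name.
inline std::vector<Node*> searchNodeWithName(Node& node, const std::string& name) {
  std::vector<Node*> found;
  if (node.name == name) found.push_back(&node);
  for (std::size_t i = 0; i < node.sons.size(); ++i) {
    std::vector<Node*> sub = searchNodeWithName(*node.sons[i], name);
    found.insert(found.end(), sub.begin(), sub.end());
  }
  return found;
}

// Version that hands results back through the `nodes` argument.
inline void searchNodeWithName(Node& node, const std::string& name,
                               std::vector<Node*>& nodes) {
  // Suspected cut-and-paste bug: searchNodeWithId(node, name)
  // passes a string where an int id is expected, so it cannot compile.
  std::vector<Node*> found = searchNodeWithName(node, name);  // presumed fix
  nodes.insert(nodes.end(), found.begin(), found.end());
}
```

If the real functions are templates, a mistake like this would only show up once something instantiates that particular overload, which might be why generating wrappers was what exposed it. That part is a guess.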

Anyway, I managed to get everything in PhylLib to compile. There are warnings, but I believe that they're fine. A lot of the functions call one particular function, that throws exceptions. And it throws a more general exception before a more specific one. But I looked through the code, and there seem to be good reasons for doing it this way. However, it does mean that the wrapper C++ files complain vociferously, since this appears to be the "wrong way around." I believe it's fine.
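
A guess at the mechanism, not a diagnosis of the actual warnings: C++ tries catch handlers in the order they are written, so a handler for a general exception placed ahead of one for a class derived from it catches everything, and compilers typically warn that the later handler can never run. A minimal sketch with made-up exception classes (GeneralError and SpecificError are assumptions, not the Bio++ classes), written in the order compilers expect:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception hierarchy: SpecificError derives from GeneralError.
struct GeneralError : std::runtime_error {
  explicit GeneralError(const std::string& what) : std::runtime_error(what) {}
};
struct SpecificError : GeneralError {
  explicit SpecificError(const std::string& what) : GeneralError(what) {}
};

// Reports which handler caught the exception. The specific handler comes
// first; swapping the two catch blocks would send both cases to the
// general handler and leave the specific one unreachable.
inline std::string whichHandler(bool throwSpecific) {
  std::string result = "none";
  try {
    if (throwSpecific) throw SpecificError("specific problem");
    throw GeneralError("general problem");
  } catch (const SpecificError&) {
    result = "specific handler";
  } catch (const GeneralError&) {
    result = "general handler";
  }
  return result;
}
```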

Last step with PhylLib is to make sure that it runs within Python. That should be straightforward. Basically, I just need to make sure that every function binding that should be defined, is. This means that I'll have to slightly modify the interface files for some of the static functions (basically, anything called *Tools.i), but it should be pretty easy. Then, on to PopGenLib. Hopefully that'll go much more quickly, since it has 1/3 the number of .h files (and therefore SWIG .i files).

Wednesday, July 29, 2009

58 Files Compile

Of the 103 wrapper files generated by SWIG, I've managed to get 58 to compile. The rest rely on Tree objects, with which I've had trouble. One issue--one of the objects (TreeTools) has a function that returns a structure whose definition is nested within the TreeTools class. SWIG doesn't support this, though it's on the list to be supported in a future version. I think the best thing to do there is to just remove that structure and the one function, and wait for the future version.
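
Another option, short of waiting, might be to hoist the structure out of the class in the interface file. This is untested against the real code; the sketch below uses made-up names (NestedTools, FlatTools, Summary), not the actual TreeTools members, and only shows the two shapes side by side.

```cpp
#include <cstddef>
#include <vector>

// Shape of the problem: the return type is declared inside the class.
class NestedTools {
 public:
  struct Summary {
    double sum;
    std::size_t count;
  };
  static Summary summarize(const std::vector<double>& values) {
    Summary s = {0.0, values.size()};
    for (std::size_t i = 0; i < values.size(); ++i) s.sum += values[i];
    return s;
  }
};

// Possible alternative: the same structure declared at namespace scope,
// so the function's return type is an ordinary top-level type.
struct Summary {
  double sum;
  std::size_t count;
};

class FlatTools {
 public:
  static Summary summarize(const std::vector<double>& values) {
    Summary s = {0.0, values.size()};
    for (std::size_t i = 0; i < values.size(); ++i) s.sum += values[i];
    return s;
  }
};
```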

Otherwise, I'm having issues with templating. I have one templated function that calls another in the file TreeTemplateTools. And for some reason, SWIG isn't able to find the inner function. I've posted to both the SWIG and GSoC mailing lists, and hopefully I'll get an answer.

Pjotr also asked me to move the SWIG interface files to a biopp subdirectory, to mimic the directory structure of the other projects. This was easily done.

Tuesday, July 28, 2009

PhylLib Interface Files

103 of them, to be exact. Automated as best I could, but of course I had to modify every one of them by hand. However, they all run under SWIG without error (except for possible memory leaks for a few of them, but that's probably unavoidable). Next step is to compile all the generated C++ files, but that shouldn't take as long. PhylLib should be all SWIGgy by tomorrow evening.

Monday, July 27, 2009

SeqLib Done

I'm getting better and better at this. The SeqLib library has been SWIGified. I'm running it right now in Python, without any errors that I've been able to find.

As part of getting this done, I had to fix the Matrix.i problem within the NumCalc library. It was convoluted, but basically I had to explicitly define a SWIG %rename for a vector to a vector.

Of course, there is one small change in the Python version of SeqLib that needs to be mentioned. Python doesn't support operator overloading in which the overloaded function isn't a part of the class--and there were two functions like that within the Site object of the SeqLib library. After playing with them for a while, I finally decided to just not include them.
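
For anyone who needs that functionality later, one possible route is to keep the operators on the C++ side and expose plain named functions that forward to them. The sketch below is only an illustration under assumptions: Column, columnEquals, and columnLess are made-up names, and the two operators shown are guesses at the kind of function involved, not the actual Site code.

```cpp
#include <vector>

// Hypothetical stand-in for a Site-like class (not the SeqLib Site).
struct Column {
  int position;
  std::vector<int> states;
};

// Non-member operators: defined outside the class, which is the case
// that does not map onto methods of a wrapped class.
inline bool operator==(const Column& a, const Column& b) {
  return a.position == b.position && a.states == b.states;
}
inline bool operator<(const Column& a, const Column& b) {
  return a.position < b.position;
}

// Named functions that forward to the operators; these are ordinary
// functions and need no operator support in the target language.
inline bool columnEquals(const Column& a, const Column& b) { return a == b; }
inline bool columnLess(const Column& a, const Column& b) { return a < b; }
```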

So...this coming week I try to do the final two libraries: PhylLib and PopGenLib. Since both of them are smaller than SeqLib, I hope to have them done pretty quickly.

Saturday, July 25, 2009

I'm Back

Back home yesterday afternoon. Slept a lot, but now I'm back in action. Watching myself much more carefully this time, to make sure that jet lag doesn't turn into an infection, again.

I managed to work for a couple of hours each day I was in Japan, trying to finish up numcalc. Got almost all the interface files working just fine. Two exceptions. First, LUDecomposition.i doesn't work. SWIG seems to run fine, but then it gives me some scoping errors when I try to compile the resulting wrapper C++ files. Tried several variants, but I couldn't get it working. It's a minor issue though--it doesn't seem to be necessary for any of the other wrappers. Second exception is Matrix.i, which I'll touch on in relation to the next problem.

Got all of the interface files for seqlib put together. Running through them now, though a couple still have bugs to work out. Some of it is leading back to Matrix.i inside of numcalc, which I never quite got working. That is, it all compiles, but I'm having trouble wrapping the templates. The RowMatrix object inherits from a vector of vectors. Really not pretty at all, and I'm worried that SWIG can't handle it. This problem's going to need to go to the SWIG mailing list, I'm afraid.

Saturday, July 18, 2009

Special Japan Edition

I'm writing from Kawaguchiko, at the base of Mt. Fuji. Work officially demanded my attention today...with a vengeance.

What I've been trying to do with SWIG might be outright impossible. It has severe problems when trying to inherit from a standard, templatable object.

Unlike other SWIG directives which should be defined prior to reading in a header file, you need to put %template after the *template* class is declared but *before* that template class used. Which creates a problem if the template class is created in the same header file where it is used (and you don't own the header file). I have submitted a bug/enhancement request on this:

http://sourceforge.net/tracker/?func=detail&aid=2502006&group_id=1645&atid=101645

This is still a big issue for me given that I'm wrapping a few thousand classes of which a number of them have this problem. I think that ultimately I'll need to familiarize myself with the SWIG source code and fix this myself (unless other SWIG developer with free time wants to pick this up :)).

Thanks,
-James

The workaround might be making my own vector .i file. This is not exactly pleasing, but it seems it might be the only way to do this.

Otherwise, just spent a few hours ironing out the wrinkles in NumCalc. Managed to get some more .i files to compile with SWIG. But the big problem is still ParameterList, which a ton of things use, and which inherits from vector. This is not going to be pretty.
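
The shape of the problem, as a minimal sketch with made-up stand-ins (Param and ParamList are assumptions, not the NumCalc classes):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical element type, standing in for a parameter object.
struct Param {
  std::string name;
  double value;
};

// Inherits its whole container interface (size, push_back, operator[])
// from an instantiated standard template. A wrapper generator has to know
// that exact instantiation before it can describe this class; in SWIG
// terms, a %template for the base must come before the derived class.
class ParamList : public std::vector<Param> {
 public:
  // The only member added here; everything else comes from the base.
  std::vector<std::string> names() const {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < size(); ++i) out.push_back((*this)[i].name);
    return out;
  }
};
```

In plain C++ this compiles without complaint, so the difficulty is on the wrapping side: the base class only exists once the template has been instantiated for that element type.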

Tuesday, July 14, 2009

NumCalc

Started the annoying process of going through all the automatically generated .i files, and modifying them so that they run with SWIG without complaint.  It mostly went pretty well, except for the ParameterList object.  That one is called by many others, and unfortunately it doesn't work.  The problem is that it inherits from vector, which seems to confuse SWIG.  It doesn't matter how I try to instantiate things specifically--it always tells me that I forgot to instantiate vector.  I'm afraid that without this, nothing else will work.  I've posted to both the SWIG mailing list and the NESCent SoC mailing list, but so far nothing.

Flight out for my real summer vacation is in 8 hours. Hopefully, I'll get a bit of work done over this time.

Monday, July 13, 2009

I'm Back

Managed to put in a full workday today, at last. (I've been working in fits and starts over the last few days, but also fighting off a bacterial infection in my lungs. ISMB travel -> jet lag -> contracted a cold & had a depressed immune system -> my new prokaryotic friends decided to colonize me. Fuckers.)

Anyway...here's what's up. I have a package for Python called bpp. bpp contains everything in Utils (minus a couple of objects such as BppString, which the user doesn't need, anyway). As near as I can tell, it works. Haven't done extensive unit testing, but what little I have played with all functions beautifully.

So I started working on NumCalc. Found that there were many more header files I needed to convert to SWIG .i files. And here's where things began to get interesting. SWIG needs these .i interface files in order to define the hooks into the underlying C++ libraries. Basically, these are just the header files with a few minor modifications. Well, they would be except that for some reason the biopp .h files contain an awful lot of source code. I don't know why--maybe there's a good reason that Julien et al. did that. But from my perspective, it means a lot more work on my part, trying to edit everything down so I just have the prototypes of the functions. Seriously...there are thousands of lines of actual algorithmic source code in the header files. I have never seen this be a good idea.

Anyway, today's big accomplishment was to automate a lot of the procedure. Thanks to that, I have every .i file for NumCalc completed, though not tested.
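
As an illustration of the kind of step that can be automated, here is a minimal sketch that builds the text of a bare-bones SWIG interface file for one header. It is not the procedure described above: makeInterface is a made-up helper, and the output is the simplest form of interface file, one that pulls the header in directly instead of listing edited-down prototypes.

```cpp
#include <string>

// Hypothetical helper: returns the text of a minimal SWIG interface file
// that wraps a single header under the given module name.
inline std::string makeInterface(const std::string& module,
                                 const std::string& header) {
  std::string out;
  out += "%module " + module + "\n";       // name of the generated module
  out += "%{\n";                           // code copied into the wrapper as-is
  out += "#include \"" + header + "\"\n";
  out += "%}\n";
  out += "%include \"" + header + "\"\n";  // declarations for SWIG to parse
  return out;
}
```

Calling makeInterface("bpp", "Foo.h") gives a five-line interface file for a hypothetical Foo.h. Headers that carry a lot of implementation code or templates would likely still need editing by hand afterward.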

Wednesday, July 8, 2009

Sick...

Fighting off a cold. The timeline's about right to put this as something I picked up on the plane back. Hopefully I'll be more able to concentrate tomorrow.

Did manage to get some work done, during my moments of lucidity. I now have a python module called bpp, that includes every object of utils except BppString, BppVector, KeyvalTools, and ApplicationTools. The first two I think are unnecessary anyway. The last two are having some linking problems that I'm trying to sort out. They compile just fine, it's just that when you try to import them, the thing complains.

Monday, July 6, 2009

I'm Back

Back from Stockholm, which was well worth the trip. Got to discuss actual research, and finally got to meet Pjotr and Hilmar face-to-face.

This Sunday evening I spent trying to get every object within utils into a single library, which I'm calling bpp.  This will hopefully be a single element to run/module to include/whatever.  Right now though, there are four .i files that I'm having a hard time compiling:  ApplicationTools.i, BppString.i, Font.i, and FontManager.i.  Except for those guys, Utils might even be almost complete!