…and the artifacts (legacy and first maven-based release) are on Maven Central:
Still impressed by the quick resolution of requests over at Sonatype.
Tagged with: project, release
Libraries that are essential to one’s own projects, but no longer maintained, can be annoying. Today was just another occasion where I forked an existing library and upgraded/fixed it. This time, the library is jcamp-dx, which reads spectral data files.
The original project is hosted on sourceforge.net:
The new, mavenized home of the project is here:
As soon as the project has been approved for Maven Central, I’ll be deploying legacy and new artifacts.
Tagged with: library, project
Tagged with: general, plugin
Just made a new maintenance release available for the collective-classification project: it now works with Weka 3.7.11. You can download the Weka package from here:
Tagged with: package, project, weka
Earlier this year, thanks to my expertise in data mining, I got invited to give a talk at the Plant Protection Society Symposium, titled “The plant protection data toolbox: On beyond t, F and χ”.
Despite drawing the short straw and getting the last slot of the day – right before drinks and nibbles – my talk was well received. I did various analyses of an aphid-related dataset and also briefly showed a project on insect classification that I’m working on with Cropwatch BV – all using ADAMS/WEKA, of course. The talk generated a few interesting conversations afterwards, which was really great.
The only downside was getting up at 5am and getting back home by 10pm… But well worth it!
Tagged with: adams, conference, weka
I always wanted to be able to visualize large confusion matrices as a heatmap, making it easier to see where the misclassification hot spots are. Hence I started another plugin project for the Weka Explorer.
It offers, at the moment, the following visualizations:
- text – slightly enhanced default text representation, can be saved as text file or printed
- table – representing the matrix in a JTable, can be saved as CSV file or printed
- heatmap – counts in the matrix get represented using colors chosen from a gradient generated from two colors, can be saved as image file
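
The heatmap idea can be sketched in a few lines of Java. This is not the plugin's code, just a minimal illustration that assumes a linear interpolation between the two colors and a scaling of each count by the largest count in the matrix; the class and method names (`GradientSketch`, `interpolate`, `toColors`) are made up for this example.

```java
import java.awt.Color;

public class GradientSketch {

  /** Linearly blends two colors: fraction 0 yields first, fraction 1 yields second. */
  static Color interpolate(Color first, Color second, double fraction) {
    double f = Math.max(0.0, Math.min(1.0, fraction)); // clamp to [0, 1]
    int r = (int) Math.round(first.getRed() + f * (second.getRed() - first.getRed()));
    int g = (int) Math.round(first.getGreen() + f * (second.getGreen() - first.getGreen()));
    int b = (int) Math.round(first.getBlue() + f * (second.getBlue() - first.getBlue()));
    return new Color(r, g, b);
  }

  /** Maps every count to a color, relative to the largest count (assumption of this sketch). */
  static Color[][] toColors(int[][] matrix, Color first, Color second) {
    int max = 0;
    for (int[] row : matrix) {
      for (int value : row) {
        max = Math.max(max, value);
      }
    }
    Color[][] result = new Color[matrix.length][];
    for (int i = 0; i < matrix.length; i++) {
      result[i] = new Color[matrix[i].length];
      for (int j = 0; j < matrix[i].length; j++) {
        double fraction = (max == 0) ? 0.0 : (double) matrix[i][j] / max;
        result[i][j] = interpolate(first, second, fraction);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Made-up 3x3 confusion matrix, purely for illustration.
    int[][] confusion = {{50, 2, 0}, {3, 45, 5}, {0, 7, 40}};
    Color[][] colors = toColors(confusion, Color.WHITE, Color.RED);
    for (Color[] row : colors) {
      StringBuilder line = new StringBuilder();
      for (Color c : row) {
        line.append(String.format("#%06X ", c.getRGB() & 0xFFFFFF));
      }
      System.out.println(line.toString().trim());
    }
  }
}
```

With large matrices, a logarithmic scaling of the counts might make small off-diagonal hot spots easier to spot than the linear scaling used here.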
Here is an example of the heatmap visualization, using the matrix generated by J48 on the UCI dataset optdigits:
Tagged with: gpl3, plugin, weka
On the weekend, I was working on a paper and wanted to have nice diagrams of J48 trees. However, the default visualization in Weka is anything but great looking. Due to a lack of Java libraries, I hacked together a little plugin for the Explorer that allows you to use the GraphViz executable dot to generate and display an image:
You can install a release using Weka’s package manager.
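
For the curious, calling dot from Java boils down to launching an external process. The following is only a rough sketch, not the plugin's code: it assumes that the tree has already been written to a file in dot format and that the `dot` executable is on the path; `DotSketch`, `command` and `render` are names invented for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class DotSketch {

  /** Builds the command line: dot -Tpng <input> -o <output>. */
  static List<String> command(String dotExecutable, Path dotFile, Path pngFile) {
    return Arrays.asList(dotExecutable, "-Tpng", dotFile.toString(), "-o", pngFile.toString());
  }

  /** Runs dot and returns its exit code (0 means success). */
  static int render(String dotExecutable, Path dotFile, Path pngFile)
      throws IOException, InterruptedException {
    Process process = new ProcessBuilder(command(dotExecutable, dotFile, pngFile))
        .inheritIO() // show dot's error messages on the console
        .start();
    return process.waitFor();
  }

  public static void main(String[] args) throws Exception {
    if (args.length != 2) {
      System.err.println("Usage: DotSketch <input.dot> <output.png>");
      return;
    }
    // Assumes that "dot" can be found on the path.
    int exitCode = render("dot", Paths.get(args[0]), Paths.get(args[1]));
    System.out.println("dot exited with code " + exitCode);
  }
}
```

The resulting PNG can then be loaded with `javax.imageio.ImageIO` and displayed in a panel.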
Tagged with: gpl3, plugin, weka
Stopwords support in Weka has always been a bit poor, to say the least. Initially, there was only a hard-coded list, based on the Rainbow tool. However, simply having stopwords for the English language was a bit limited. Being able to supply your own list of stopwords in the StringToWordVector filter already made the whole thing a bit more flexible. But you still couldn’t supply your own stopwords algorithm. Yesterday, I sat down and implemented a new class hierarchy centered around the weka.core.stopwords.StopwordsHandler interface. I added the following algorithms:
- Null – never flags a word as stopword
- Rainbow – previous hard-coded list of stopwords
- WordsFromList – loads words from a file
- RegExpFromList – applies regular expressions loaded from a file
- MultiStopwords – applies multiple stopwords algorithms sequentially
Eibe reworked the StringToWordVector filter today to make use of the new class hierarchy.
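
To give an idea of how such a hierarchy fits together, here is a minimal, self-contained sketch. It does not use the Weka classes: the `Handler` interface below is a local stand-in so that the example runs without Weka on the classpath, and all class names are invented. The sketch also assumes that the combined handler flags a word as soon as any of its base handlers does, and that word lists are matched case-insensitively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class StopwordsSketch {

  /** Local stand-in for a stopwords handler interface (not the Weka one). */
  interface Handler {
    boolean isStopword(String word);
  }

  /** Never flags a word as stopword. */
  static class NullHandler implements Handler {
    public boolean isStopword(String word) {
      return false;
    }
  }

  /** Flags words from a fixed list; case-insensitive matching is a choice of this sketch. */
  static class WordList implements Handler {
    private final Set<String> words = new HashSet<>();

    WordList(String... words) {
      for (String word : words) {
        this.words.add(word.toLowerCase());
      }
    }

    public boolean isStopword(String word) {
      return words.contains(word.toLowerCase());
    }
  }

  /** Flags words that fully match at least one regular expression. */
  static class RegExpList implements Handler {
    private final List<Pattern> patterns = new ArrayList<>();

    RegExpList(String... expressions) {
      for (String expression : expressions) {
        patterns.add(Pattern.compile(expression));
      }
    }

    public boolean isStopword(String word) {
      for (Pattern pattern : patterns) {
        if (pattern.matcher(word).matches()) {
          return true;
        }
      }
      return false;
    }
  }

  /** Asks each base handler in turn; the first one that flags the word wins. */
  static class Multi implements Handler {
    private final List<Handler> handlers;

    Multi(Handler... handlers) {
      this.handlers = Arrays.asList(handlers);
    }

    public boolean isStopword(String word) {
      for (Handler handler : handlers) {
        if (handler.isStopword(word)) {
          return true;
        }
      }
      return false;
    }
  }

  public static void main(String[] args) {
    Handler handler = new Multi(new WordList("the", "and"), new RegExpList("\\d+"));
    for (String word : new String[]{"The", "2014", "weka"}) {
      System.out.println(word + ": " + handler.isStopword(word));
    }
  }
}
```

A filter then only needs to hold a single handler reference and call `isStopword` for each token, regardless of which algorithm sits behind it.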
Tagged with: project, weka
Today, I released a new version of the python-weka-wrapper library: 0.1.8.
Apart from being able to create Instance objects from Python lists as well as NumPy arrays, there is no new functionality, just bugfixes: the scatterplot for datasets and the installer were broken.
Tagged with: project, python, release, weka
Last year, while working on a consulting project, I had to export lots of screenshots from ADAMS. I got so annoyed at constantly having to click through my directory hierarchy that I implemented a little accessory component for the JFileChooser in Swing, allowing me to define bookmarks. Man, that made it so much easier all of a sudden!
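
The basic mechanism is simple: JFileChooser accepts any JComponent via setAccessory and displays it inside the dialog. Below is a stripped-down sketch of that idea, not the actual component: the class name `BookmarksSketch` and its methods are made up, and the bookmarks live only in memory rather than being stored anywhere.

```java
import java.awt.BorderLayout;
import java.io.File;
import java.util.function.Consumer;
import javax.swing.DefaultListModel;
import javax.swing.JFileChooser;
import javax.swing.JList;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.ListSelectionModel;
import javax.swing.SwingUtilities;

public class BookmarksSketch extends JPanel {

  private final DefaultListModel<File> model = new DefaultListModel<>();
  private final JList<File> list = new JList<>(model);

  /** The callback receives the directory of the bookmark that got selected. */
  public BookmarksSketch(Consumer<File> onSelect) {
    super(new BorderLayout());
    list.setSelectionMode(ListSelectionModel.SINGLE_SELECTION);
    list.addListSelectionListener(e -> {
      File dir = list.getSelectedValue();
      if (!e.getValueIsAdjusting() && dir != null) {
        onSelect.accept(dir);
      }
    });
    add(new JScrollPane(list), BorderLayout.CENTER);
  }

  public void addBookmark(File dir) {
    model.addElement(dir);
  }

  /** Selects a bookmark programmatically, which triggers the callback. */
  public void select(int index) {
    list.setSelectedIndex(index);
  }

  public static void main(String[] args) {
    SwingUtilities.invokeLater(() -> {
      JFileChooser chooser = new JFileChooser();
      // Selecting a bookmark makes the chooser jump to that directory.
      BookmarksSketch bookmarks = new BookmarksSketch(chooser::setCurrentDirectory);
      bookmarks.addBookmark(new File(System.getProperty("user.home")));
      chooser.setAccessory(bookmarks);
      chooser.showOpenDialog(null);
    });
  }
}
```

Passing the navigation step in as a callback keeps the panel independent of the chooser, which also makes it easy to try out without opening a dialog.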
The component is modeled after the bookmarks from the file chooser that GNOME users have been familiar with for many years. Here is a screenshot:
You can find the project homepage here:
And on Maven Central: