Friday, May 15, 2009

wolfram alpha + taxonomy = ?

Whenever a new search engine is introduced I like to test it with the family of wasps I specialize on, the Diapriidae. Wolfram|Alpha has been getting a lot of hype lately, and tonight it went live. While it largely choked and acted oddly (if it worked at all) on my search terms, some searches did appear to complete. I tested the term Diapriidae; needless to say it didn't do too well, as parasitic wasps are not dinosaurs. I suppose it gets some points for recognizing I was requesting something related to a taxonomic classification... maybe. "Hymenoptera" was more successful, but with (very) minimal results. "total species of hymenoptera" could not be interpreted. Searching for taxonomy returns some interesting results, including "taxonomic networks" (e.g.). You can click to see the sources for a given result, and those sources are (nicely) linked; Species 2000/ITIS is listed as one. It doesn't appear that you can drill down much past orders of insects. In general I'm very underwhelmed thus far. The primary source is listed as Wolfram|Alpha curated data, so it will be interesting to see if they expand their database. Watch for job postings, cybertaxonomists!

Sunday, January 11, 2009

augmented reality



This technology is pretty cool, and seems to have been around for a while (see also this more recent application). I'm not sure what it has to do with cybertaxonomy, but it seems like it should have some application. Perhaps the technology could be used by including the "fiducial marker" along with the (paper) barcodes that are attached to mounted specimens. By waving a device over a drawer of specimens one would get images (magnifications), metadata, or some other cool or useful information. Maybe you could build a physical tree, with something like Meccano or Lego pieces "enhanced" with these markers. The relationships among the physical branches could then be interpreted in the augmented reality, perhaps mapping character state transitions onto the physical tree, overlaying geographic distributions, or some such. This could also be a great way to get kids into museums. First, at sites (schools?) away from the museum, hand out "game" cards, each with some information on an organism and one of the aforementioned markers. These cards could then be brought to a museum that housed an augmentation system. There, once-"invisible" information would be revealed, for example movies, 3D representations, or pointers to where real live versions could be found in the museum.

Sunday, September 21, 2008

ping (visualization)

Popping up to note that, from Rod Page's post on the recent Nascent meeting, it appears that visualization is a hot topic. If you aren't following Moritz Stefaner's well formed data blog, and you are interested in visualization, then you are really missing out. His latest post brings up all sorts of possibilities for phylogenetic or taxonomic implementations. For kids or students, why not print up a collectible card game <cough>was a long time ago</cough> with a biodiversity theme, embed interesting data in the cards, and marvel as phylogenetic (or ecological, etc.) relationships appear when the cards are placed on that ridiculously cool table?

Wednesday, April 30, 2008

Relationalizing Nexus files with Ruby and mx

Ok, so "relationalizing" isn't really a word, but I kind of like how it sounds. For the past couple of weeks I've been writing a Mesquite (i.e. Nexus) file parser in Ruby. It uses the same basic lexer/parser engine that I mentioned in a previous post, the one that reads Newick-formatted trees, to create a Ruby Nexus file object. All the good bits are easily accessible from the object (well, most of them: some blocks are not parsed yet, but that's just a matter of extending the parser). With this Nexus file object it was relatively trivial to write a conversion to mx (see fig.), i.e. to a fully relational format.

The Ruby file parsing code is currently a plugin/library in the mx source, but it can be easily extracted for use in other projects. Look for the code in mx 0.2.1540 and onwards when it makes it to Sourceforge, or contact me directly if you're really keen to get your hands on it.
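
For readers who have not seen the lexer/parser approach, here is a minimal sketch of the general idea. It is not the mx code: the method names `nexus_tokens` and `nexus_blocks` are made up for illustration, and it only handles the simplest cases (no nested comments, no block-specific parsing). It tokenizes the text, then groups the semicolon-terminated commands under the block they appear in.

```ruby
require "strscan"

# Hypothetical, minimal tokenizer for Nexus text (an illustration, not the mx parser).
# Splits the input into words, single-quoted strings, and punctuation,
# skipping whitespace and simple (non-nested) [comments].
# Limitation: a quoted string that is exactly ";" is not distinguished
# from the command terminator.
def nexus_tokens(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/) || scanner.skip(/\[[^\]]*\]/)
      next
    elsif scanner.scan(/'((?:[^']|'')*)'/)
      tokens << scanner[1].gsub("''", "'") # '' is an escaped quote
    elsif scanner.scan(/[;=,()]/)
      tokens << scanner.matched
    elsif scanner.scan(/[^\s;=,()\[\]']+/)
      tokens << scanner.matched
    else
      raise ArgumentError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens
end

# Groups commands by block name, e.g. { "TAXA" => [["DIMENSIONS", ...], ...] }.
def nexus_blocks(text)
  tokens = nexus_tokens(text)
  raise ArgumentError, "not a Nexus file" unless tokens.shift.to_s.upcase == "#NEXUS"

  blocks = Hash.new { |hash, key| hash[key] = [] }
  current = nil
  command = []
  tokens.each do |token|
    if token == ";"
      head = command.first.to_s.upcase
      if head == "BEGIN"
        current = command[1].to_s.upcase
        blocks[current] # register the block even if it turns out to be empty
      elsif head == "END" || head == "ENDBLOCK"
        current = nil
      elsif current && !command.empty?
        blocks[current] << command
      end
      command = []
    else
      command << token
    end
  end
  blocks
end
```

From a structure like this, each block's commands can be walked and written out as rows in whatever tables you like, which is the "relationalizing" step.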

Tuesday, April 1, 2008

Plasto-types?

Slashdot and the BBC have picked up a story on the use of X-ray radiography to look at fossil insects. I've seen some pre-press examples of the same technology on extant bugs, and the results are incredible, in part because all the soft tissues are left intact and you can get any cross section you want. I'm really curious as to 1) the resolution on the fossils (on extant critters it is apparently very close to SEM calibre); and 2) the cost. This could be a huge boon to making fossil data available to phylogenetic studies. The kicker comes at the end: the researchers suggest that the printed plastic insect could be designated as a type specimen. In many ways this would have huge advantages, as anybody with the technology could print their own types. The real suggestion underlying this is not that the plastic itself is the type, but rather that a set of 1s and 0s can be typified, the plastic of course being generated from digital data. This of course opens up a whole can of (prehistoric?) worms. How many 1s and 0s are needed before a (meaningful) type can be designated? Can I use a CoolPix for imaging, maybe bundle my images with some 1s and 0s that encode some specimen data, and typify the resulting zip file? From a pragmatic standpoint, why not?

Friday, February 29, 2008

barcoding (dna) Google tech talk

I haven't seen this pointed at yet: a little state of the union from Hebert and Janzen that was posted on Google's tech talks. "It [barcoding] works with startling clarity." ;) There are a number of interesting insights during the question period, with a general focus on scale, diminishing returns, and legal issues (curious that, coming from Google and all).

Sunday, January 27, 2008

misc.

Annotating images with user-created overlays is a must for defining morphological characters, or enhancing ontologies. This seems to be a relatively tricky thing to do over the web. Inputdraw is a SWF widget that is free for noncommercial use. It allows you to save overlays drawn on images into forms; the data are then saved as SVG text. The OpenCollections software also mentions an annotation system that is in the works, though the bottom line there is that it's not quite ready for prime time. At a recent Morphbank meeting someone (apologies for not remembering who) mentioned that another solution might involve using the Google Maps API to create polylines or points on custom "maps", which would in fact be your images. The Beginning Google Maps Applications with Rails and Ajax book is a decent starting point for implementing this approach. Note that you would need .gif or .png formatted images if you attempt this. I used the Beginning Google Maps book to finally implement maps "natively" within mx (we still have hooks to BerkeleyMapper, which is a great service), but not without several hours of frustration. While the book is quite clearly written, the code provided in Chapter 3 is incomplete or erroneous in several very frustrating ways, so make sure to download the updated code from the website if you have the book.
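
To make the "overlay saved as SVG text" idea concrete, here is a rough Ruby sketch of storing a drawn outline as SVG and reading the points back. I don't know exactly what Inputdraw emits, so the element layout here is an assumption, and the helper names `overlay_to_svg` and `svg_to_overlay` are invented for the example.

```ruby
# Hypothetical helpers (not Inputdraw's or mx's API): store an overlay drawn on
# an image as a single SVG polyline, using pixel coordinates on that image.
def overlay_to_svg(points, width:, height:, stroke: "red")
  coords = points.map { |x, y| "#{x},#{y}" }.join(" ")
  %(<svg xmlns="http://www.w3.org/2000/svg" width="#{width}" height="#{height}">) +
    %(<polyline points="#{coords}" fill="none" stroke="#{stroke}"/></svg>)
end

# Read the points back out of SVG text produced by overlay_to_svg.
# A regex is enough for this one known shape; arbitrary SVG needs a real XML parser.
def svg_to_overlay(svg)
  match = svg.match(/<polyline[^>]*\spoints="([^"]*)"/)
  return [] unless match

  match[1].split.map { |pair| pair.split(",").map(&:to_f) }
end

# Example: outline a structure on a 640x480 image, save the text, read it back.
svg = overlay_to_svg([[120, 80], [200, 95], [210, 160]], width: 640, height: 480)
points = svg_to_overlay(svg)
```

Because the overlay is just text, it can sit in an ordinary form field or database column next to the image record.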

As I add new tables to mx I'm adding fewer and fewer columns, with the idea that tagging can be used as the primary means of extending the basic objects. Tagging essentially allows you to extend your records to as many fields as you want, and is therefore a very simple way to provide extensibility. A new book on the tagging phenomenon by Gene Smith looks to be a must-read; I think I'll order mine now.
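
As a rough illustration of tagging as extension, here is a plain-Ruby sketch. It is not the mx schema: `Tag` and `Record` are hypothetical names, and a real implementation would keep tags in their own table pointing back at the tagged row. The point is only that a new "field" is a new keyword, not a new column.

```ruby
# Hypothetical sketch of tag-based extension (not the mx schema).
# A tag is a keyword plus an optional value, attached to any record.
Tag = Struct.new(:keyword, :value)

class Record
  attr_reader :name, :tags

  def initialize(name)
    @name = name
    @tags = []
  end

  # Attach a tag; returns self so calls can be chained.
  def tag(keyword, value = nil)
    @tags << Tag.new(keyword.to_s, value)
    self
  end

  # All values recorded under a keyword, in the order they were added.
  def values_for(keyword)
    @tags.select { |t| t.keyword == keyword.to_s }.map(&:value)
  end
end

# Example: two "host" values and a bare flag, with no schema change needed.
specimen = Record.new("specimen 1")
specimen.tag(:host, "Diptera").tag(:host, "Coleoptera").tag(:needs_review)
```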

Finally, while somewhat older news, I keep thinking about social annotation with respect to taxonomy, and also things like scoring phylogenetic characters. How might things discussed in this Google talk from Luis von Ahn be applied to systematics research? Within taxonomy one approach could be to simply photograph many specimens and then allow the general public to point out the similarities and differences. These could then be vetted by the "experts" as a starting point. This approach, albeit with a greatly simplified "taxonomy" and character set, is being used at GalaxyZoo. In the GalaxyZoo example the galaxies have already been classified by computer algorithms into various types, so there is an excellent comparative dataset for testing things like the trustworthiness of the public contributions. Games like those discussed by von Ahn could also be used in developing hypotheses of character homology. In systematics we present homology hypotheses that are then further tested using phylogenetic analysis. What if part of this testing required that the definition of these hypotheses be agreed upon by two or more experts, using games like those discussed by von Ahn? In theory this agreement is already required, to some degree, through the review process that occurs prior to publication. It could, however, be made more explicit (and fun?). Given the right framework for playing these games there would be many beneficial spin-offs, including obvious things like annotated ontologies of morphological characters.
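
The "agreed upon by two or more" rule is simple enough to sketch. The following is only an illustration of that one rule, not anything from von Ahn's games or from GalaxyZoo: the method name `agreed_labels`, the data layout, and the crude text normalization are all my own assumptions.

```ruby
# Hypothetical sketch: keep only those character descriptions that were
# proposed independently by at least min_agree different contributors.
# proposals is a list of [contributor, label] pairs.
def agreed_labels(proposals, min_agree: 2)
  supporters = Hash.new { |hash, key| hash[key] = [] }
  proposals.each do |contributor, label|
    key = label.strip.downcase # crude normalization; real matching would need more care
    supporters[key] << contributor unless supporters[key].include?(contributor)
  end
  supporters.select { |_label, who| who.size >= min_agree }.keys.sort
end

# Example: ann and bob agree on one description; ann repeating herself does not count twice.
proposals = [
  ["ann", "Notauli present"],
  ["bob", "notauli present "],
  ["ann", "notauli present"],
  ["cat", "wing venation reduced"]
]
accepted = agreed_labels(proposals)
```

Anything that clears the threshold would still go to the "experts" for vetting; the game only supplies the candidates.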