In this blog, I will discuss a simple toolkit you may use to create a vector space model (VSM) (Salton 75). The toolkit is called the Map/Reduce Text Mining Toolkit (MRTMT). For now, however, it does not cover the full scope of text mining; it only creates a VSM from text documents. The purpose […]
Last time, I analyzed Mahout’s collaborative filtering algorithm. In this blog, I will write about computing the canonical Pearson correlation between two variables for a set of data using Hadoop’s M/R paradigm. If you have already written your own M/R tasks for jobs, this tutorial is not for you. If you are just starting […]
I have been studying the Mahout v0.4 API. Mahout is a machine learning API, written in Java, that may be used on top of Hadoop. In particular, I have been digging into Mahout’s clustering and collaborative filtering code. In this blog, I will not say much, since most of what […]
In this blog, I will demonstrate a way of displaying an End User License Agreement (EULA) in a Windows Phone 7 (WP7) application (app). Why is a blog like this one necessary? To be honest, showing a EULA is not as easy as it seems. Here are some problems that I have encountered. 1. […]
In this blog, I will talk about iText, a Java API for creating and manipulating PDFs, and JFreeChart, a Java API for creating charts and graphs. Based on my experience with both, I will make some suggestions on how to use AND not use these APIs for generating PDFs with charts and graphs. I really wanted […]
In this blog entry, I will show, with a few lines of code, how to create an in-memory PDF report and send it as an email attachment. Why is this exercise important? It is important to me because I’m involved in a lot of report generation projects where the reporting logic and display […]
In a previous article, I detailed the anatomy of a Google web search result page. In this article, I will talk about an open source Java API that I developed for using Google’s web search without an API key. It is released under the Apache 2.0 License.