Thursday, July 26, 2007

to READ

http://code.google.com/edu/content/submissions/uwspr2007_clustercourse/listing.html

Monday, July 09, 2007

Another Idea - Speech recognition

Two or three years ago I looked up some work on speech-to-text conversion.
The approaches were based on neural networks, and I did not find them helpful.

I don't know how things stand now, but here is the idea that sprang to my mind yesterday.

Saturday, July 07, 2007

Towards the Semantic Web, one step at a time

I was traveling back from my hometown to Bangalore when I came across a good science fiction article that revolves around the semantic web and embedded devices.
It reinforced my vision of a world where information is available at the time we want it and the place we need it.

I have been reading a few papers on information retrieval and the semantic web [collaborative filtering / social networks]. A few of these papers have fascinated me, and I believe they are going to make the world an easier place to live in.

Here is my iota of contribution towards this end:
This is a continuation of my efforts started here.

I implemented the concepts mentioned in the paper above to come up with a program capable of building a concept map for a small document corpus.

Here is the concept map:


I used the Stanford NLP parser to extract the noun phrases from the documents.
I used JGraphT to represent the graph (as a DirectedGraph) and JGraph to display it.
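
To make the idea concrete, here is a minimal sketch of how such a map could be built. It is not my actual code: it assumes the noun phrases have already been extracted, it assumes two phrases get linked whenever they appear in the same document (the weight being the number of documents they share), and it uses plain Java collections in place of JGraphT so that it runs on its own. The class name ConceptMapSketch and the sample phrases are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: a concept map as a weighted co-occurrence graph.
final class ConceptMapSketch {

    // Returns phrase -> (neighbour phrase -> number of documents containing both).
    // Each document is given as the list of noun phrases already extracted from it.
    static Map<String, Map<String, Integer>> build(List<List<String>> docs) {
        Map<String, Map<String, Integer>> graph = new TreeMap<>();
        for (List<String> doc : docs) {
            // Count each phrase only once per document.
            Set<String> phrases = new LinkedHashSet<>(doc);
            for (String a : phrases) {
                Map<String, Integer> neighbours =
                        graph.computeIfAbsent(a, k -> new TreeMap<>());
                for (String b : phrases) {
                    if (!a.equals(b)) {
                        // Assumption: co-occurrence in one document adds 1 to the edge weight.
                        neighbours.merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return graph;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("google", "yahoo", "search engine"),
                Arrays.asList("google", "yahoo"),
                Arrays.asList("semantic web", "search engine"));
        build(docs).forEach((phrase, neighbours) ->
                System.out.println(phrase + " -> " + neighbours));
    }
}
```

The edges are stored in both directions here, which is one way to mimic an association on top of a directed graph.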

Lots of improvements to be done:
  1. At present all the document processing and the generation of the concept map happen in memory at run time. That sucks up a lot of memory. Hence I'll have to use persistent storage to store the nounPhrase - Document association, the Document information, and the nounPhrase - nounPhrase association. I am planning to try hsqldb for this. It also supports in-memory databases.
  2. Need to speed up the execution with some parallel processing. I will try getting help from a professor at IISc [SERC] for this.
  3. I need to add the type of relationship between two entities (along with its weight). For example: Google has a relationship with Yahoo. Right now my program only says that there is an association, but not what type of association. [It can be competitors / successful startups / young CEOs / web companies / great places to work at / etc.]
  4. Optimise the code.
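
For item 3, one possible shape for typed, weighted relationships is sketched below. This is only an illustration under my own assumptions: the class TypedRelationSketch, its method names, and the numeric weights are all hypothetical, and repeated evidence for the same relation type is simply summed.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: directed edges that carry a relation type and a weight.
final class TypedRelationSketch {

    // from -> to -> (relation type -> accumulated weight)
    private final Map<String, Map<String, Map<String, Double>>> edges = new TreeMap<>();

    // Records evidence for one type of relationship; weights for the same type add up.
    void relate(String from, String to, String type, double weight) {
        edges.computeIfAbsent(from, k -> new TreeMap<>())
             .computeIfAbsent(to, k -> new TreeMap<>())
             .merge(type, weight, Double::sum);
    }

    // All relation types recorded from one entity to another (empty if none).
    Map<String, Double> relations(String from, String to) {
        Map<String, Map<String, Double>> out =
                edges.getOrDefault(from, Collections.emptyMap());
        return out.getOrDefault(to, Collections.emptyMap());
    }

    // The relation type with the largest weight, or null if there is no edge.
    String strongest(String from, String to) {
        String best = null;
        double bestWeight = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : relations(from, to).entrySet()) {
            if (e.getValue() > bestWeight) {
                best = e.getKey();
                bestWeight = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TypedRelationSketch map = new TypedRelationSketch();
        map.relate("Google", "Yahoo", "competitors", 2.0);
        map.relate("Google", "Yahoo", "web companies", 1.5);
        System.out.println(map.relations("Google", "Yahoo"));
        System.out.println(map.strongest("Google", "Yahoo"));
    }
}
```

How the relation type would actually be detected from the text is the hard part, and this sketch says nothing about it.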
And I want to learn to develop applications for embedded devices too.
I will start with image processing on mobile phones. I bought a second-hand Nokia 6600 for this.
The two things I have in mind for mobile applications are:
  • An image processing program that will tell me the destination of a bus when I photograph its destination board (which is in Kannada), so that I don't have to rush to the bus to ask "Bhaiya, Majestic jayega?"
  • An image processing program (again) which converts my notes (which are quite random) into a good PowerPoint presentation (it should at least capture the shapes and the content)
Hmm.. seems that I talk a lot. Let me get back to work!