Sunday, September 29, 2013

Week 9/16/13

Hello everyone!

We have been working with RapidMiner. We started off by processing different types of data, such as large text files and Excel worksheets. We also calculated similarity measurements between the documents.
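To give a feel for what a similarity measurement between two documents looks like, here is a minimal sketch in plain Python. It assumes cosine similarity over simple word counts, which is only one of several possible measures and not necessarily the one we used in the tool. The function names `term_counts` and `cosine_similarity` are made up for this illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count each alphanumeric word."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the two word-count vectors (0.0 to 1.0)."""
    counts_a = term_counts(doc_a)
    counts_b = term_counts(doc_b)
    # Only words that appear in both documents contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Two documents with the same words score close to 1, and two documents with no words in common score 0.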

Most likely, we will examine Firefox, Eclipse, and OpenOffice bugs. These are tracked in Bugzilla. So far we have not found an automated way to get the bug data for a given time period from it, so we are currently looking for an efficient method.

We also looked at a few models that cluster the data to predict duplicates automatically. This type of model shows promise. It can be scaled to handle the massive number of bug reports in a system. It is also efficient enough for real-time bug detection.
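As a rough illustration of the clustering idea, here is a minimal sketch in Python. It is not one of the models we looked at. It assumes a simple greedy scheme in which each new report is compared only against one representative per cluster, using Jaccard similarity on word sets. The names `tokens`, `jaccard`, and `cluster_reports`, and the threshold of 0.5, are all made up for this example.

```python
import re


def tokens(text):
    """Set of lowercase alphanumeric words in a bug report summary."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection over size of the union of two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_reports(reports, threshold=0.5):
    """Greedily group reports; returns a list of lists of report indices."""
    clusters = []  # each cluster keeps its first report's words as representative
    for index, text in enumerate(reports):
        words = tokens(text)
        best, best_score = None, threshold
        for cluster in clusters:
            score = jaccard(words, cluster["rep"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            # Nothing is similar enough, so this report starts a new cluster.
            clusters.append({"rep": words, "members": [index]})
        else:
            # Likely duplicate of the reports already in this cluster.
            best["members"].append(index)
    return [cluster["members"] for cluster in clusters]
```

Because a new report is compared against one representative per cluster instead of every earlier report, the work per report grows with the number of clusters, which is the kind of property that makes this family of models attractive for large bug databases.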

