Sunday, September 29, 2013

Week 9/16/13

Hello everyone!

We have been working with RapidMiner. We started off by processing different types of data, such as large text files and Excel worksheets. We also calculated similarity measurements for each document.

Most likely, we will examine Firefox, Eclipse, and OpenOffice bugs. These are found on Bugzilla. We have not yet found an automated way to pull the bug data for a specific time period from the site, so we are currently looking for an efficient method.

We also looked at a few models that cluster the data to predict duplicates automatically. This type of model shows promise. It can be scaled to handle the massive number of bug reports in a system, and it is efficient enough for real-time bug detection.
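
To give a feel for the clustering idea, here is a minimal Python sketch. It is only an illustration under our own assumptions, not one of the models we read about: the function names, the 0.5 threshold, and the sample reports are all made up. Each new report is compared against one representative per cluster, which is why this kind of approach can stay cheap as reports keep arriving.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets: |A and B| / |A or B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_reports(reports, threshold=0.5):
    """Greedy clustering: put each report in the first cluster whose
    representative (its first member) is at least `threshold` similar.
    The threshold value is an arbitrary choice for this illustration."""
    token_sets = [set(r.lower().split()) for r in reports]
    clusters = []  # each cluster is a list of report indices
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(tokens, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)  # likely duplicate of this cluster
                break
        else:
            clusters.append([i])  # no match, so start a new cluster
    return clusters


# Made-up sample reports, not real Bugzilla data.
sample_reports = [
    "browser crashes when opening new tab",
    "browser crashes when closing tab",
    "spell checker ignores custom dictionary",
]
clusters = cluster_reports(sample_reports)
```

With these sample reports, the first two land in the same cluster and the third starts its own. A real system would need better text processing than splitting on spaces.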


Week 9/2/13 and 9/9/13

I hope everyone had a nice Labor Day!

We have been very busy. We looked into various methods of calculating text similarity, including the cosine, Dice, and Jaccard measures.
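
For anyone curious how these three measures work, here is a minimal Python sketch. It assumes a very simple whitespace tokenizer, and the function names and sample report text are made up for illustration; this is not our actual experiment setup. Cosine is computed on word counts, while Dice and Jaccard are computed on the sets of distinct words.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (deliberately naive)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between the two word-count vectors."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dice_similarity(a, b):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|) on distinct words."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def jaccard_similarity(a, b):
    """Jaccard index: |A and B| / |A or B| on distinct words."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Made-up sample text, not real bug reports.
report_a = "browser crashes when opening new tab"
report_b = "browser crashes when closing tab"
scores = {
    "cosine": cosine_similarity(report_a, report_b),
    "dice": dice_similarity(report_a, report_b),
    "jaccard": jaccard_similarity(report_a, report_b),
}
```

All three measures range from 0 (no shared words) to 1 (identical word content), but they weigh the overlap differently, so the same pair of reports can get noticeably different scores.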

Other researchers have used execution data from bug reports in the past. We researched the possibility of creating this execution data for each error and comparing the similarity of the commands used alongside the natural language data.

We also started to use RapidMiner to parse and process text and HTML documents.

Wednesday, September 4, 2013

Starting up 2013!

Here is the first blog post for the fall 2013 CREU project at Youngstown State University. So far, we have been busy doing preliminary research. We may apply an interesting mathematical concept, similarity indices, to the practical problem of duplicate bug detection in software development.
We are also testing out some data mining software for running our experiments. This year looks very promising, and we are very excited to get started!