Computer Science Research

CS Research Profiles

The College of Science and Mathematics has compiled a list of profiles of all the research done by our faculty. To access it, click here.

The faculty conduct research in areas such as algorithms and data structures, compiler design, software engineering, artificial intelligence, pattern recognition, computer graphics, computer science education, simulation and visualization, and database theory. Recent projects range from machine learning to open source textbooks, and the department's body of work spans many years. Here are some abstracts of recent CS research done by the faculty:


Accuracy of class prediction using similarity functions in PAM

By: Vasil Hnatyshin

Abstract:

Clustering has proven to be an effective technique for finding data instances with similar characteristics. Clustering algorithms are based on the notion of distance between data points, often computed using the Euclidean metric, which is why they are mostly applicable to data sets consisting of numerical values. However, real-life data often contain features that are categorical in nature. For example, to identify abnormal behavior or a cyberattack in a network, we usually examine packet headers, which contain categorical values such as source and destination IP addresses, source and destination port numbers, upper-layer protocols, etc. The Euclidean metric is not applicable to such data sets because it cannot compute the distance between categorical values. To address this problem, similarity functions have been designed to determine the relationship between given categorical values. Similarity defines how closely related objects are to one another; it can often be thought of as the opposite of distance, in that similar objects receive a high value while dissimilar objects receive a low or zero value. In this paper we explored the accuracy of various similarity functions using the Partitioning Around Medoids (PAM) clustering algorithm. We tested similarity functions on several data sets to determine their ability to correctly predict class labels, and we also examined the applicability of various similarity functions to different types of data sets.
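To make the idea concrete, here is a minimal sketch of how a categorical similarity function can drive a PAM-style (k-medoids) clustering. The overlap similarity, the toy packet-header records, and the simplified medoid-update loop below are illustrative assumptions, not the specific functions or data sets evaluated in the paper.

```python
import numpy as np

def overlap_similarity(x, y):
    # Fraction of categorical attributes that match (one of many possible similarity functions).
    x, y = np.asarray(x), np.asarray(y)
    return np.mean(x == y)

def dissimilarity_matrix(records):
    # PAM works on dissimilarities, so convert similarity to 1 - similarity.
    n = len(records)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = 1.0 - overlap_similarity(records[i], records[j])
    return d

def k_medoids(d, k, n_iter=50, seed=0):
    # Naive PAM-style k-medoids on a precomputed dissimilarity matrix:
    # alternate assignment and medoid update (a simplification of the full PAM swap search).
    rng = np.random.default_rng(seed)
    n = d.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(d[:, medoids], axis=1)           # assign each record to its nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                continue
            costs = d[np.ix_(members, members)].sum(axis=0)  # total dissimilarity to each candidate medoid
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return labels, medoids

# Toy categorical "packet header" records: (protocol, source port class, destination port)
records = [
    ("tcp", "high", "80"), ("tcp", "high", "80"), ("tcp", "high", "443"),
    ("udp", "low", "53"),  ("udp", "low", "53"),  ("icmp", "n/a", "n/a"),
]
labels, medoids = k_medoids(dissimilarity_matrix(records), k=3)
print(labels, medoids)
```

In this sketch, cluster quality depends entirely on how well the chosen similarity function separates the categorical records, which is exactly the question the paper investigates across several such functions.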


The Distributed Esteemed Endorser Review: A Novel Approach to Participant Assessment in MOOCs

By: Jennifer S. Kay, Tyler J. Nolan, Thomas M. Grello

Abstract:

One of the most challenging aspects of developing a Massive Open Online Course (MOOC) is designing an accurate method to effectively assess participant knowledge and skills. The Distributed Esteemed Endorser Review (DEER) approach has been developed as an alternative for those MOOCs where traditional approaches to assessment are not appropriate. In DEER, course projects are certified in person by an "Esteemed Endorser", an individual who is typically senior in rank to the student but is not necessarily an expert in the course content. Not only does DEER provide a means to certify that course goals have been met, but it also provides MOOC participants with the opportunity to share what they have learned with others at the local level.


Manifold Learning for Multivariate Variable-length Sequences with an Application to Similarity Search

By: Shen-Shyang Ho

Abstract:

Twin support vector machine (TSVM), least squares TSVM (LSTSVM) and energy-based LSTSVM (ELS-TSVM) satisfy only the empirical risk minimization principle. Moreover, the matrices in their formulations are always positive semi-definite. To overcome these problems, we propose in this paper a robust energy-based least squares twin support vector machine algorithm, called RELS-TSVM for short. Unlike TSVM, LSTSVM and ELS-TSVM, our RELS-TSVM maximizes the margin with a positive definite matrix formulation and implements the structural risk minimization principle, which embodies the essence of statistical learning theory. Furthermore, RELS-TSVM utilizes energy parameters to reduce the effect of noise and outliers. Experimental results on several synthetic and real-world benchmark datasets show that RELS-TSVM not only yields better classification performance but also has lower training time compared to ELS-TSVM, LSPTSVM, LSTSVM, TBSVM and TSVM.
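For readers unfamiliar with the twin-SVM family, the sketch below implements the plain least squares TSVM baseline that the abstract compares against, not RELS-TSVM itself; the energy parameters and robust formulation are omitted. The closed-form solutions, the small ridge term added for numerical stability, and the toy data are assumptions made for illustration.

```python
import numpy as np

def lstsvm_fit(A, B, c1=1.0, c2=1.0, ridge=1e-6):
    # Linear least squares twin SVM: each class gets its own hyperplane, found by
    # solving a small linear system instead of a quadratic program.
    # A: rows are samples of class +1, B: rows are samples of class -1.
    E = np.hstack([A, np.ones((A.shape[0], 1))])   # augmented matrix [A  e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])   # augmented matrix [B  e]
    I = ridge * np.eye(E.shape[1])                 # ridge term for invertibility (illustrative)
    # Hyperplane 1: close to class +1 points, at least unit distance from class -1 points.
    z1 = -np.linalg.solve(F.T @ F + (1.0 / c1) * E.T @ E + I, F.T @ np.ones(F.shape[0]))
    # Hyperplane 2: close to class -1 points, at least unit distance from class +1 points.
    z2 = np.linalg.solve(E.T @ E + (1.0 / c2) * F.T @ F + I, E.T @ np.ones(E.shape[0]))
    return (z1[:-1], z1[-1]), (z2[:-1], z2[-1])    # (w1, b1), (w2, b2)

def lstsvm_predict(X, plane1, plane2):
    # Assign each point to the class whose hyperplane it lies closer to.
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)

# Toy 2-D data: two well-separated Gaussian blobs (hypothetical, for illustration only).
rng = np.random.default_rng(0)
A = rng.normal(loc=[2, 2], scale=0.5, size=(50, 2))    # class +1
B = rng.normal(loc=[-2, -2], scale=0.5, size=(50, 2))  # class -1
plane1, plane2 = lstsvm_fit(A, B)
X = np.vstack([A, B])
y = np.r_[np.ones(50), -np.ones(50)]
print("training accuracy:", np.mean(lstsvm_predict(X, plane1, plane2) == y))
```

The robust energy-based variant described in the abstract builds on this kind of formulation but replaces the fixed unit targets with energy parameters and uses a positive definite matrix formulation, which is what gives it its reported robustness to noise and outliers.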