Concept location techniques help isolate the sections of source code that relate to a specific concept. Blind signal separation techniques such as Singular Value Decomposition and Latent Semantic Indexing can be used to identify related sections of source code. This paper explores a related technique, Independent Component Analysis (ICA), which has the added benefit of identifying statistically independent signals in text rather than signals that are merely decorrelated. We describe a tool that we have developed to explore how ICA performs when analysing source code, and show how the technique can be used to perform unsupervised concept location.
Scott Grant, James R. Cordy, David B. Skillicorn