This paper explores a method to algorithmically distinguish case-specific facts from potentially reusable or adaptable elements of cases in a textual case-based reasoning (TCBR) system. In the legal domain, documents often contain case-specific facts mixed with case-neutral details of law, precedent, conclusions the attorneys reach by applying their interpretation of the law to the case facts, and other aspects of argumentation that attorneys could potentially apply to similar situations. The automated distinction of these two categories, namely facts and other elements, has the potential to improve the quality of automated textual case acquisition. The ultimate goal is to distinguish the case problem from its solution. To separate facts from other elements, we use an information gain (IG) algorithm to identify words that serve as efficient markers of one category or the other. We demonstrate that this technique can successfully distinguish case-specific fact paragraphs from others, and propose future work...
Jason M. Proctor, Ilya Waldstein, Rosina Weber
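
To make the idea of IG-based marker words concrete, the following minimal sketch scores a word by how much knowing whether it appears in a paragraph reduces uncertainty about that paragraph's category. It assumes the standard entropy-based definition of information gain, whitespace tokenization, and two hypothetical labels (`fact` and `other`); the abstract does not specify the authors' exact formulation or preprocessing, and the four toy paragraphs are invented for illustration only.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(paragraphs, labels, word):
    """IG of the binary feature 'word occurs in paragraph' with respect to labels."""
    with_word, without_word = Counter(), Counter()
    for text, label in zip(paragraphs, labels):
        tokens = set(text.lower().split())  # assumption: naive whitespace tokens
        (with_word if word in tokens else without_word)[label] += 1

    classes = sorted(set(labels))
    n = len(labels)
    # Entropy of the labels before splitting on the word.
    base = entropy([with_word[c] + without_word[c] for c in classes])
    # Weighted entropy of the labels after splitting on word presence.
    conditional = 0.0
    for part in (with_word, without_word):
        size = sum(part.values())
        conditional += (size / n) * entropy([part[c] for c in classes])
    return base - conditional


# Hypothetical toy data, not taken from the paper.
paragraphs = [
    "the defendant signed the lease on march 3",
    "the plaintiff paid the deposit in cash",
    "the court held that the statute applies",
    "under the statute a lease requires notice",
]
labels = ["fact", "fact", "other", "other"]

for candidate in ("statute", "lease", "the"):
    print(candidate, round(information_gain(paragraphs, labels, candidate), 3))
```

In this toy set, a word that occurs only in one category scores highest, while a word that occurs in every paragraph, or equally often in both categories, scores zero and so is useless as a marker.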