Longest common subsequence (LCS) is a standard similarity measure for sequences. We argue that LCS ignores the structural commonality in LCS residues and therefore fails to represent the full-spectrum similarity between sequences. To address this weakness, we introduce the recursive longest common subsequence (rLCS), which generalizes LCS by recursively aggregating the structural commonality of LCS residues.
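For reference, the classic LCS that rLCS builds on can be computed with the textbook dynamic-programming recurrence. The sketch below is only that baseline, not the rLCS method itself, whose recursion over residues is not specified in this description. The function name `lcs` is our own choice for illustration.

```python
from typing import List, Sequence


def lcs(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return one longest common subsequence of a and b (classic LCS)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table to recover one optimal subsequence.
    result: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result
```

For example, `lcs(list("AGGTAB"), list("GXTXAYB"))` yields `['G', 'T', 'A', 'B']`. The elements of each input left outside this subsequence are what classic LCS does not take into account.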


Datasets for Recursive Longest Common Subsequence (rLCS):

 

Farm-Ads: 100 random samples

Mushroom: 200 random samples

20 Newsgroups: 40 random samples from each class

 

All datasets are space-delimited. The rightmost attribute on each line is the class label.
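A minimal loading sketch for this layout is shown below. It assumes one record per line and plain UTF-8 text. The function names `parse_line` and `load_dataset` are hypothetical, and the file path is whatever local copy of a dataset you have.

```python
from typing import List, Tuple


def parse_line(line: str) -> Tuple[List[str], str]:
    """Split one space-delimited record into (attributes, class label)."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected at least one attribute and a class label")
    # The rightmost token is the class; everything before it is the sequence.
    return tokens[:-1], tokens[-1]


def load_dataset(path: str) -> List[Tuple[List[str], str]]:
    """Read every non-empty line of a dataset file (assumed one record per line)."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                records.append(parse_line(line))
    return records
```

Each record comes back as a list of attribute tokens paired with its class label, so the attribute lists can be passed directly to a sequence-similarity function.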