Learning useful, predictable features from past workloads and exploiting them well is a major source of improvement in many operating system problems. We review known parallel workload features and argue that the correct approach for future on-line algorithm design, as well as for workload modeling, is user- and session-based modeling rather than the direct analysis of jobs that is common today. We then provide statistically sound answers to two basic questions. First, which user and session features are central enough to be potentially useful? We answer this using Principal Component Analysis. Second, which user and session classes exist, and how can they be identified on-line? We answer this using K-means clustering. We identify variable sets that explain over 80% of the variance between sessions and between users, and we identify five stable session classes (clusters) and four stable user classes. Our analysis is based on logs from seven different parallel supercomputers, spanning over 87 months, which are analyzed ...