Increasing amounts of data is collected in most areas of research and application. The degree to which this data can be accessed, analyzed, and retrieved, is a decisive in obtaining progress in fields such as scientific research or industrial production. We present a novel methodology supporting content-based retrieval and exploratory search in repositories of multivariate research data. In particular, our methods are able to describe two-dimensional functional dependencies in research data, e.g. the relationship between ination and unemployment in economics. Our basic idea is to use feature vectors based on the goodness-of-fit of a set of regression models to describe the data mathematically. We denote this approach Regressional Features and use it for content-based search and, since our approach motivates an intuitive definition of interestingness, for exploring the most interesting data. We apply our method on considerable real-world research datasets, showing the usefulness of our approach for user-centered access to research data in a Digital Library system.
39400 data points