Automatic Chart Interpretation
Visualizations are commonly used to present quantitative information. They are ubiquitous in scientific articles, textbooks, economic reports, press articles, and web pages. In many cases, these visualizations are the only publicly available representation of the underlying data. When well designed, visualizations leverage human visual processing to convey information efficiently and effectively, but they are not intended for machine consumption. This is unfortunate, as centuries of publications, both print and online, represent data visually. This project aims to develop computational models that interpret data-driven diagrams and extract the underlying data, the graphical marks, and the mappings that link the data to mark attributes. The extracted information can enable a variety of applications. A straightforward one is to improve search engines by better incorporating figures. Another is restyling or retargeting visualizations; this task is essential because many published charts exhibit poor perceptual design choices that may hamper understanding.
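To make the three extraction targets concrete (data, graphical marks, and the mappings between them), here is a minimal Python sketch of how an interpreted chart might be represented. It is an illustration only, not the project's actual model: the class names `Mark`, `LinearEncoding`, and `ChartInterpretation`, their fields, and the assumption of a purely linear scale are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """A graphical mark detected in the chart (hypothetical structure)."""
    kind: str    # e.g. "bar", "point"
    attrs: dict  # visual attributes in pixels, e.g. {"height": 120.0}


@dataclass
class LinearEncoding:
    """Assumed linear mapping between a data field and a mark attribute."""
    data_field: str     # name of the data column, e.g. "value"
    attr: str           # mark attribute it drives, e.g. "height"
    pixel_range: tuple  # (pixel at domain start, pixel at domain end)
    data_domain: tuple  # (data value at range start, data value at range end)

    def invert(self, pixel: float) -> float:
        """Map a pixel measurement back to a data value."""
        p0, p1 = self.pixel_range
        d0, d1 = self.data_domain
        return d0 + (pixel - p0) * (d1 - d0) / (p1 - p0)


@dataclass
class ChartInterpretation:
    """Marks plus the encodings that link them to the underlying data."""
    marks: list
    encodings: list

    def recover_data(self) -> list:
        """Rebuild one data row per mark by inverting every applicable encoding."""
        rows = []
        for mark in self.marks:
            row = {}
            for enc in self.encodings:
                if enc.attr in mark.attrs:
                    row[enc.data_field] = enc.invert(mark.attrs[enc.attr])
            rows.append(row)
        return rows


# Usage: two bars whose heights encode "value" on a 0-50 axis drawn over 200 px.
chart = ChartInterpretation(
    marks=[Mark("bar", {"height": 100.0}), Mark("bar", {"height": 200.0})],
    encodings=[LinearEncoding("value", "height", (0.0, 200.0), (0.0, 50.0))],
)
print(chart.recover_data())  # [{'value': 25.0}, {'value': 50.0}]
```

Keeping the encodings explicit, rather than storing only the recovered values, is what would make applications such as restyling possible under this sketch: the same data rows could be re-rendered through a different encoding.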