Nivan Ferreira, Jorge Poco, Huy T. Vo, Juliana Freire, Claudio T. Silva
Comparison of taxi trips from Lower Manhattan to JFK and LGA airports in May 2011. The query on the left selects trips that occurred on Sundays, while the one on the right selects trips that occurred on Mondays. Users specify these queries by visually selecting regions on the map and connecting them. In addition to inspecting the results depicted on the map, i.e., the dots corresponding to pickups (blue) and dropoffs (orange) of the selected trips, they can also explore the results through other visual representations. The scatter plots below the maps show the relationship between hour of the day and trip duration. Points in the plots are colored according to the spatial constraint represented by the arrows between the regions: trips to JFK in blue, and trips to LGA in red. The plots show that many of the trips on Monday between 3PM and 5PM take much longer than trips on Sundays.
As increasing volumes of urban data are captured and become available, new opportunities arise for data-driven analysis that can lead to improvements in the lives of citizens through evidence-based decision making and policies. In this paper, we focus on a particularly important urban data set: taxi trips. Taxis are valuable sensors, and information associated with taxi trips can provide unprecedented insight into many different aspects of city life, from economic activity and human behavior to mobility patterns. But analyzing these data presents many challenges. The data are complex, containing geographical and temporal components in addition to multiple variables associated with each trip. Consequently, it is hard to specify exploratory queries and to perform comparative analyses (e.g., compare different regions over time). This problem is compounded by the size of the data: there are on average 500,000 taxi trips each day in NYC. We propose a new model that allows users to visually query taxi trips. Besides standard analytics queries, the model supports origin-destination queries that enable the study of mobility across the city. We show that this model is able to express a wide range of spatio-temporal queries, and that it is flexible: queries can be composed, and different aggregations and visual representations can be applied, allowing users to explore and compare results. We have built a scalable system that implements this model. The system supports interactive response times, uses an adaptive level-of-detail rendering strategy to generate clutter-free visualizations for large results, and shows hidden details to users in summary form through overlay heat maps. We present a series of case studies motivated by traffic engineers and economists that show how our model and system enable domain experts to perform tasks that were previously unattainable for them.
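To make the idea of an origin-destination query concrete, the following is a minimal Python sketch and not the system described in the paper. It assumes regions can be approximated by bounding boxes and that each trip carries pickup and dropoff times and coordinates. The names `Region`, `Trip`, `od_query`, and `duration_minutes` are hypothetical, the bounding boxes are rough illustrative values, and the three sample trips are made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """Axis-aligned box in (longitude, latitude); a hypothetical stand-in
    for a region the user selects on the map."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


@dataclass(frozen=True)
class Trip:
    pickup_time: datetime
    dropoff_time: datetime
    pickup: Tuple[float, float]   # (lon, lat)
    dropoff: Tuple[float, float]  # (lon, lat)


def od_query(trips: List[Trip], origin: Region, destination: Region,
             weekday: Optional[int] = None) -> List[Trip]:
    """Trips that start in `origin` and end in `destination`, optionally
    restricted to one weekday (Monday=0 ... Sunday=6, as in datetime)."""
    return [
        t for t in trips
        if origin.contains(*t.pickup)
        and destination.contains(*t.dropoff)
        and (weekday is None or t.pickup_time.weekday() == weekday)
    ]


def duration_minutes(trip: Trip) -> float:
    return (trip.dropoff_time - trip.pickup_time).total_seconds() / 60.0


# Rough, illustrative boxes (assumptions, not taken from the paper).
lower_manhattan = Region(-74.02, 40.70, -73.97, 40.73)
jfk = Region(-73.83, 40.62, -73.75, 40.67)

# Made-up sample trips; 1 May 2011 was a Sunday, 2 May 2011 a Monday.
trips = [
    Trip(datetime(2011, 5, 1, 16, 0), datetime(2011, 5, 1, 16, 35),
         (-74.00, 40.71), (-73.78, 40.64)),
    Trip(datetime(2011, 5, 2, 16, 0), datetime(2011, 5, 2, 17, 5),
         (-74.00, 40.71), (-73.78, 40.64)),
    Trip(datetime(2011, 5, 2, 9, 0), datetime(2011, 5, 2, 9, 20),
         (-74.00, 40.71), (-73.98, 40.75)),  # dropoff outside the JFK box
]

sunday_trips = od_query(trips, lower_manhattan, jfk, weekday=6)
monday_trips = od_query(trips, lower_manhattan, jfk, weekday=0)
```

Comparing `[duration_minutes(t) for t in sunday_trips]` with the same list for `monday_trips` mirrors, in a very small way, the Sunday versus Monday comparison in the figure. A linear scan like this would not give interactive response times on 500,000 trips per day; it only illustrates the shape of the query.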
 title = {Visual Exploration of Big Spatio-Temporal Urban Data: A Study of New York City Cab Trips},
 author = {Nivan Ferreira AND Jorge Poco AND Huy T. Vo AND Juliana Freire AND Claudio T. Silva},
 journal = {IEEE Transactions on Visualization and Computer Graphics},
 year = {2013},
 volume = {19},
 number = {12},
 pages = {2149--2158},
 url = {},