The proposed UrbanFormer combines the vectorized output of OneFormer with the probability vector produced by a modified ViT to classify how a street scene is perceived: safe or unsafe.
Abstract
Over the past four decades, urban perception has become a vital area of research that intersects multiple fields, such as criminology, psychology, and urban planning. This interdisciplinary approach seeks to understand and interpret how people perceive urban environments and how these perceptions shape their behavior. The surge in data collection methods, driven by modern web technologies and services, has enabled researchers to apply techniques from various domains to better quantify and analyze urban perception. In this study, we present UrbanFormer, a vision transformer-based model for urban perception analysis, trained on the widely used Place Pulse 2.0 dataset. We focus on the safety category, a key issue in urban perception, and combine vision transformers with explainability methods to provide insight into the decision-making process behind perception analysis.
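To make the combination described in the figure caption more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the "vectorized output" of the segmentation model is a vector of per-class pixel fractions, that the two vectors are fused by simple concatenation, and that a logistic layer produces the safe/unsafe score. The helper names (`class_proportions`, `fuse`, `predict_safe`) and all weights are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def class_proportions(segmentation_mask, num_classes):
    """Turn a 2D mask of class ids into a vector of per-class pixel fractions.

    Assumption: this stands in for the 'vectorized output' of a segmentation
    model such as OneFormer; the paper may vectorize differently.
    """
    counts = Counter(pixel for row in segmentation_mask for pixel in row)
    total = sum(counts.values())
    return [counts.get(c, 0) / total for c in range(num_classes)]


def fuse(seg_vector, vit_probs):
    """Concatenate both vectors (assumed fusion strategy, not confirmed by the source)."""
    return list(seg_vector) + list(vit_probs)


def predict_safe(features, weights, bias):
    """Logistic score in (0, 1); values above 0.5 are read as 'safe' in this sketch."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Toy 2x2 mask with three hypothetical classes (0 = road, 1 = tree, 2 = building).
mask = [[0, 0], [1, 2]]
seg_vector = class_proportions(mask, num_classes=3)

# Hypothetical probability vector from the image classifier: [p_safe, p_unsafe].
vit_probs = [0.7, 0.3]

features = fuse(seg_vector, vit_probs)

# Placeholder weights; in practice these would be learned from labeled data.
weights = [0.0] * len(features)
probability_safe = predict_safe(features, weights, bias=0.0)
```

A side benefit of keeping the segmentation fractions as explicit features is that each input dimension maps to a named visual element, which fits the stated goal of explaining which elements drive the safety prediction.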
Materials
BibTeX
@inproceedings{2024-StreetView,
 title = {What Makes a Place Feel Safe? Analyzing Street View Images to Identify Relevant Visual Elements},
 author = {Felipe Moreno-Vera AND Bruno Brandoli AND Jorge Poco},
 booktitle = {International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)},
 year = {2024},
 url = {http://www.visualdslab.com/papers/StreetView},
}