Abstract
Given a picture taken somewhere in the world, automatically geo-localizing it is an extremely useful task, especially for historical and forensic sciences, documentation purposes, organization of the world’s photographs, and intelligence applications. While tremendous progress has been made in recent years in visual location recognition within a single city, localization in natural environments is much more difficult, since vegetation, illumination, and seasonal changes make appearance-only approaches impractical. In this work, we target mountainous terrain and use digital elevation models to extract representations for fast visual database lookup. We propose an automated approach for very large scale visual localization that can efficiently exploit visual information (contours) and geometric constraints (consistent orientation) at the same time. We validate the system at the scale of Switzerland (40,000 \(\hbox{km}^2\)) using over 1000 landscape query images with ground truth GPS position.
Notes
Synthetic experiments verified that taking the photo from ten or fifty meters above the ground does not degrade recognition, except in very special cases such as standing very close to a small wall.
Acknowledgments
This work was supported by the Swiss National Science Foundation through SNF Grant 127224. We thank Simon Wenner for his help in rendering the DEMs, Hiroto Nagayoshi for providing the CH2 dataset, and the anonymous reviewers for useful discussions and constructive feedback.
Additional information
Communicated by Edmond Boyer.
About this article
Cite this article
Saurer, O., Baatz, G., Köser, K. et al. Image Based Geo-localization in the Alps. Int J Comput Vis 116, 213–225 (2016). https://doi.org/10.1007/s11263-015-0830-0
DOI: https://doi.org/10.1007/s11263-015-0830-0