Abstract
In autonomous driving, prediction tasks address complex spatio-temporal data. This article describes the examination of Recurrent Neural Networks (RNNs) for object trajectory prediction in the image space. The proposed methods enhance the performance and spatio-temporal prediction capabilities of Recurrent Neural Networks. Two different data augmentation strategies and a hyperparameter search are implemented for this purpose. A conventional data augmentation strategy and a Generative Adversarial Network (GAN) based strategy are analyzed with respect to their ability to close the generalization gap of Recurrent Neural Networks. The results are then discussed using single-object tracklets provided by the KITTI Tracking Dataset. This work demonstrates the benefits of augmenting spatio-temporal data with GANs.
Summary
In autonomous driving, predictions from complex spatio-temporal data are required. This article describes the investigation of Recurrent Neural Networks (RNNs) for predicting object trajectories in the image space. The proposed methods improve the spatio-temporal prediction capability of Recurrent Neural Networks. To this end, two different data augmentation strategies and a hyperparameter search are implemented. A conventional data augmentation and a Generative Adversarial Network (GAN) are analyzed with respect to their ability to close the generalization gap of Recurrent Neural Networks. The results are discussed using single-object trajectories from the KITTI Tracking Dataset. This work shows the benefits of augmenting spatio-temporal data with GANs.
About the authors
M. Sc. Mark Schutera is a doctoral student in the field of deep learning for autonomous driving in the “Algorithms and Machine Learning Perception System” group at ZF Friedrichshafen AG and at the Institute for Automation and Applied Computer Science at the Karlsruhe Institute of Technology. Research interests: deep learning, autonomous driving, computer vision, data analytics and image processing.
Prof. Dr. rer. nat. Stefan Elser is a professor of autonomous driving at the Hochschule Ravensburg-Weingarten. Research interests: machine learning, object detection, sensor fusion and their applications in autonomous driving.
Dr. rer. nat. Jochen Abhau is team leader of the “Algorithms and Machine Learning Perception System” group at ZF Friedrichshafen. Research interests: machine learning, image processing, data analytics, deep learning and autonomous driving.
Apl. Prof. Dr.-Ing. Ralf Mikut is head of the research area “Automated Image and Data Analysis” at the Institute for Automation and Applied Computer Science at the Karlsruhe Institute of Technology. Research interests: machine learning, image processing, data analytics, computational intelligence, and various applications in engineering and life sciences.
PD Dr.-Ing. Markus Reischl is head of the research group “Machine Learning for High-Throughput and Mechatronics” at the Institute for Automation and Applied Computer Science at the Karlsruhe Institute of Technology. Research interests: man-machine interfaces, image processing, machine learning and data analytics.
Acknowledgment
With thanks to Katherine Quinlan-Flatter for proofreading the article.
© 2019 Walter de Gruyter GmbH, Berlin/Boston