Authors:
Nikolaos Passalis; Pavlos Tosidis; Theodoros Manousis and Anastasios Tefas
Affiliation:
Computational Intelligence and Deep Learning Group, AIIA Lab., Department of Informatics, Aristotle University of Thessaloniki, Greece
Keyword(s):
Active Perception, Deep Learning, Active Vision, Active Robotic Perception.
Abstract:
Deep Learning (DL) has brought significant advancements in recent years, greatly enhancing various challenging computer vision tasks, including object detection and recognition, scene segmentation, and face recognition, among others. DL's advanced perception capabilities have also paved the way for powerful tools in robotics, resulting in remarkable applications such as autonomous vehicles, drones, and robots capable of seamless interaction with humans, as in collaborative manufacturing. However, despite these achievements, a significant limitation persists: most existing methods adhere to a static inference paradigm inherited from traditional computer vision pipelines. Indeed, DL models typically perform inference on a fixed and static input, ignoring the fact that robots possess the capability to interact with their environment to gain a better understanding of their surroundings. This process, known as "active perception", closely mirrors how humans and various animals interact with and comprehend their environment. For instance, humans tend to examine objects from different angles when uncertain, while some animals have specialized muscles that allow them to orient their ears towards the source of an auditory signal. Active perception offers numerous advantages, enhancing both the accuracy and efficiency of the perception process. However, combining deep learning and active perception in robotics also comes with several challenges: the training process often requires interactive simulation environments and dictates the use of more advanced approaches, such as deep reinforcement learning, while deployment pipelines must be appropriately modified to enable control within the perception algorithms. In this paper, we go through recent breakthroughs in deep learning that facilitate active perception across various robotics applications, and provide key application examples. These applications span from face recognition and pose estimation to object detection and real-time high-resolution analysis.
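To make the contrast with static inference concrete, the acquire-act-reassess cycle described above can be sketched as a minimal loop. This is an illustrative toy, not the paper's method: the environment, classifier, and viewpoint-selection policy below are hypothetical stand-ins (in practice the policy would be learned, e.g., with deep reinforcement learning, and observations would come from a real or simulated sensor).

```python
def classify(observation):
    """Toy classifier: returns (label, confidence) for one viewpoint.

    Hypothetical stand-in for a deep model; here observations already
    carry a label and a confidence score.
    """
    label, confidence = observation
    return label, confidence

def next_viewpoint(current):
    """Toy policy: deterministically move to the next viewpoint.

    In an actual active perception system this would be a learned
    control policy selecting the most informative next action.
    """
    return current + 1

def active_perception(env, max_steps=5, threshold=0.9):
    """Iteratively acquire views until the classifier is confident."""
    viewpoint = 0
    best = (None, 0.0)
    for _ in range(max_steps):
        label, conf = classify(env[viewpoint])
        if conf > best[1]:
            best = (label, conf)
        if conf >= threshold:
            break  # confident enough: stop acquiring new views
        viewpoint = next_viewpoint(viewpoint)
    return best

# Simulated environment: each index is a viewpoint yielding (label, confidence).
env = [("cup", 0.4), ("cup", 0.6), ("cup", 0.95), ("cup", 0.8), ("cup", 0.7)]
print(active_perception(env))  # → ('cup', 0.95)
```

The key difference from the static paradigm is that inference and control are interleaved: the perception result at each step drives the decision to stop or to act and observe again, which is why deployment pipelines must expose control to the perception algorithm.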