[See links below for slides and recorded videos.] Visual instance-level recognition and retrieval are fundamental tasks in computer vision. Despite recent advances in this field, many techniques have been evaluated on a limited number of domains, with a small number of classes. We believe that the research community can benefit from a new suite of datasets and associated challenges, both to improve the understanding of the limitations of current technology and to provide an opportunity to introduce new techniques. The Instance-Level Recognition (ILR) Workshop is a follow-up to two successful editions of the Landmark Recognition Workshop. While the previous editions focused solely on landmarks, our Instance-Level Recognition workshop will consider three domains: artworks, landmarks and products.
Andre Araujo
Ping Luo (Associate Professor, University of Hong Kong)
Instance Detection, Segmentation, Landmark Estimation and Beyond
Tobias Weyand, Bingyi Cao and Cam Askew
Teams keetar (#1), bysj (#2), Open Neural Network Exchange (#5).
(Hosted by Torsten Sattler and Bohyung Han)
Diane Larlus (Senior Research Scientist, NAVER LABS Europe)
(Hosted by Ondrej Chum) Aug 28, 22:00 - 22:45 UTC+1
Tobias Weyand, Bingyi Cao and Cam Askew
Guangxing Han and Giorgos Tolias
Xu Zhang
Andre Araujo
Senior Research Scientist at NAVER LABS Europe
In the first part of the talk, we will move beyond instance-level retrieval and consider the task of semantic image retrieval in complex scenes, where the goal is to retrieve images that share the same semantics as the query image. Although this task is more subjective and more complex, one can show that human annotators rank visual scenes semantically in a consistent way, and that suitable embedding spaces can be learnt for semantic retrieval. The second part of the presentation will focus on cross-modal retrieval. More specifically, we will consider the problem of cross-modal fine-grained action retrieval between captions and videos. Cross-modal retrieval is commonly achieved by learning a shared embedding space into which either modality can be embedded. In this part we will show how to enrich the embedding space by disentangling parts-of-speech (PoS) in the accompanying captions.
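As a rough illustration of retrieval in a shared embedding space, the sketch below ranks a few video embeddings against a caption embedding by cosine similarity. The vectors are made-up toy values and the function names are hypothetical; it does not implement the PoS disentangling described in the talk, only the generic nearest-neighbour ranking step that such an embedding space enables.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_by_cosine(query, gallery):
    """Return gallery indices ordered from most to least similar to the query."""
    q = l2_normalize(query)
    scored = []
    for idx, item in enumerate(gallery):
        g = l2_normalize(item)
        similarity = sum(a * b for a, b in zip(q, g))
        scored.append((similarity, idx))
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


# Toy embeddings (assumed values, for illustration only).
caption_embedding = [0.9, 0.1, 0.0]
video_embeddings = [
    [0.0, 1.0, 0.0],
    [1.0, 0.2, 0.0],
    [0.5, 0.5, 0.5],
]

ranking = rank_by_cosine(caption_embedding, video_embeddings)
print(ranking)
```

In a real system both sets of vectors would come from learned caption and video encoders that map into the same space; here they are simply written by hand.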
Associate Professor at the University of Hong Kong
This talk will cover three general topics in instance-level visual perception: fashion image understanding, whole-body human landmark (face, hand and body keypoint) estimation, and instance detection and segmentation. First, we will introduce a new perspective on modelling object masks in polar space by proposing PolarMask, an efficient single-stage instance segmentation pipeline. Secondly, we will introduce a new benchmark for full-body human landmark estimation that predicts keypoints of the human face, hands and torso simultaneously. Thirdly, we will apply human segmentation and pose estimation to highly realistic fashion image generation.
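To make the idea of a mask in polar space concrete, here is a minimal sketch that assumes a mask is described by a center point and the lengths of rays cast at evenly spaced angles, and converts that description back into contour points. The function names, the choice of 36 rays and the toy values are assumptions for illustration, not the PolarMask implementation.

```python
import math


def polar_to_contour(center, ray_lengths):
    """Convert a center and per-angle ray lengths into (x, y) contour points.

    Ray i is assumed to point at angle 2*pi*i/n, measured from the x-axis.
    """
    cx, cy = center
    n = len(ray_lengths)
    points = []
    for i, length in enumerate(ray_lengths):
        theta = 2.0 * math.pi * i / n
        points.append((cx + length * math.cos(theta),
                       cy + length * math.sin(theta)))
    return points


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Toy example: 36 rays of equal length approximate a disc of radius 5.
contour = polar_to_contour((10.0, 20.0), [5.0] * 36)
area = polygon_area(contour)
print(len(contour), round(area, 2))
```

With equal ray lengths the contour is a regular polygon, so its area approaches that of a circle as the number of rays grows; a network predicting such a representation only needs to output one center and a fixed-length vector of distances per instance.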
Software Engineer, Google (Primary Contact)
Software Engineer, Google
Software Engineer, Google
Associate Professor, Czech Technical University
Associate Professor, Seoul National University
Postdoctoral Research Scientist, Columbia University
Associate Professor, Chalmers University of Technology
Software Engineer, Google
Assistant Professor, Czech Technical University
Software Engineer, Google
Applied Scientist, Amazon Alexa