Instance-Level Recognition Workshop at ICCV'21


Visual instance-level recognition and retrieval are fundamental tasks in computer vision. Despite recent advances in this field, many techniques have been evaluated on only a limited number of domains, each with a small number of classes. We believe the research community can benefit from a new suite of datasets and associated challenges, both to improve the understanding of the limitations of current technology and to provide an opportunity to introduce new techniques.

This year, our workshop will focus on three different domains: artworks, landmarks and products.

The Instance-Level Recognition (ILR) Workshop is a follow-up to three successful previous editions: the first two focused only on landmark recognition (CVPRW18, CVPRW19), and the most recent one covered other domains as well (ECCVW20).

Workshop Topics

Landmark Recognition

Recognize landmarks in images.
Challenge Website

Landmark Retrieval

Retrieve relevant landmark images from a large-scale database.
Challenge Website

Artwork Recognition

Recognize artworks in images.

Product Retrieval

Retrieve relevant product images from a large-scale database.

Workshop Schedule

Welcome Remarks

Andre Araujo
Video, Slides

Oct 11, 13:00 - 13:15 EDT

Landmark Challenges: New Fairer Dataset

Zu Kim, Google Research
Video, Slides

Oct 11, 13:15 - 13:30 EDT

Landmark Recognition Challenge: Overview

Cam Askew, Bingyi Cao, Tobias Weyand
Video, Slides

Oct 11, 13:30 - 13:40 EDT

Landmark Recognition Challenge: Winner Presentations

(Hosted by Bohyung Han)
Video

Oct 11, 13:40 - 14:00 EDT

Invited Talk 1

Prof. Laura Leal-Taixé, Technical University of Munich
Learning Intra-Batch Connections for Deep Metric Learning

(Hosted by Torsten Sattler)
Video

Oct 11, 14:00 - 14:45 EDT

Landmark Retrieval Challenge: Overview

Cam Askew, Bingyi Cao, Tobias Weyand
Video, Slides

Oct 11, 14:45 - 14:55 EDT

Landmark Retrieval Challenge: Winner Presentations

(Hosted by Bohyung Han)
Video

Oct 11, 14:55 - 15:15 EDT


Oct 11, 15:15 - 16:00 EDT

Interactive Product Retrieval: Overview and Presentations

Xu Zhang and team
Video, Slides

Oct 11, 16:00 - 16:30 EDT

Artwork Recognition: Overview and Presentations

Nikolaos Ypsilantis
Video, Slides

Oct 11, 16:30 - 17:00 EDT

Invited Talk 2

Dr. Rong Jin, Alibaba
Improving Instance Classification by Representation Learning

(Hosted by Jack Sim)
Video

Oct 11, 17:00 - 17:45 EDT

Closing Remarks

Andre Araujo
Video, Slides

Oct 11, 17:45 - 18:00 EDT

Invited Speakers

Laura Leal-Taixé

Professor at Technical University of Munich

Learning Intra-Batch Connections for Deep Metric Learning

The goal of metric learning is to learn a function that maps samples to a lower-dimensional space where similar samples lie closer than dissimilar ones. Particularly, deep metric learning utilizes neural networks to learn such a mapping. Most approaches rely on losses that only take the relations between pairs or triplets of samples into account, which either belong to the same class or two different classes. However, these methods do not explore the embedding space in its entirety. To this end, we propose an approach based on message passing networks that takes all the relations in a mini-batch into account. We refine embedding vectors by exchanging messages among all samples in a given batch allowing the training process to be aware of its overall structure. Since not all samples are equally important to predict a decision boundary, we use an attention mechanism during message passing to allow samples to weigh the importance of each neighbor accordingly. We achieve state-of-the-art results on clustering and image retrieval on the CUB-200-2011, Cars196, Stanford Online Products, and In-Shop Clothes datasets.
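As a rough illustration of the idea in the abstract above, the sketch below runs one round of attention-weighted message passing over a mini-batch of embeddings. It is not the speaker's implementation: the parameter-free scaled dot-product attention, the residual update, the exclusion of each sample from its own neighbor set, and the names `softmax` and `message_passing_step` are all assumptions made for this example. The method described in the talk would use learned weights trained end to end.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def message_passing_step(embeddings):
    """One round of attention-weighted message passing over a mini-batch.

    Each sample scores every other sample in the batch with a scaled dot
    product, turns the scores into attention weights, aggregates the
    neighbors' embeddings with those weights, and adds the aggregate to
    its own embedding (a residual update). No learned parameters are
    used here; that is a simplification for illustration only.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    refined = []
    for i, query in enumerate(embeddings):
        # Assumption: a sample does not send a message to itself.
        neighbors = [e for j, e in enumerate(embeddings) if j != i]
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in neighbors
        ]
        weights = softmax(scores)
        message = [
            sum(w * n[d] for w, n in zip(weights, neighbors)) for d in range(dim)
        ]
        refined.append([q + m for q, m in zip(query, message)])
    return refined


# Toy batch: the first two samples are similar, the third is different.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
refined = message_passing_step(batch)
```

In this toy batch the first sample puts more attention weight on the second sample than on the third, so its refined embedding is pulled mostly toward its similar neighbor. That is the sense in which every relation in the batch, and not only a pair or a triplet, contributes to each embedding.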

Prof. Dr. Laura Leal-Taixé is a tenure-track professor (W2) at the Technical University of Munich, leading the Dynamic Vision and Learning group. Before that, she spent two years as a postdoctoral researcher at ETH Zurich, Switzerland, and a year as a senior postdoctoral researcher in the Computer Vision Group at the Technical University of Munich. She obtained her PhD from the Leibniz University of Hannover in Germany, spending a year as a visiting scholar at the University of Michigan, Ann Arbor, USA. She pursued her B.Sc. and M.Sc. in Telecommunications Engineering at the Technical University of Catalonia (UPC) in her native city of Barcelona. She went to Boston, USA, to do her Master's thesis at Northeastern University with a fellowship from the Vodafone Foundation. She is a recipient of the Sofja Kovalevskaja Award of 1.65 million euros for her project socialMaps.

Rong Jin

Head of Machine Intelligence Technologies at Alibaba

Improving Instance Classification by Representation Learning

We have witnessed the growing significance of representation learning in computer vision. In this talk, we will present our studies of various methods of representation learning, including supervised, semi-supervised, and unsupervised representation learning and representation learning that leverages side information, in the context of large-scale image search and classification, long-tailed image classification, and ReID. On the one hand, our large-scale studies show that it is important for methods to exploit all possible hints to effectively guide the construction of appropriate representations for images. On the other hand, it is also crucial to carefully examine the supervised information used for representation learning, as it can be noisy, inaccurate, or incomplete in describing rich visual content.

Rong Jin is currently an associate director of DAMO Academy at Alibaba, leading the research and development of state-of-the-art AI technologies. Before joining Alibaba, he was a faculty member of the Computer Science and Engineering Dept. at Michigan State University from 2003 to 2015. His research is focused on statistical machine learning and its application to large-scale data analysis. He has published over 300 technical papers, mostly in top conferences and prestigious journals. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) and ACM Transactions on Knowledge Discovery from Data. Dr. Jin holds a Ph.D. in Computer Science from Carnegie Mellon University. He received the NSF CAREER Award in 2006 and the best paper award from COLT in 2012.


Organizers

Andre Araujo

Google Research (Primary Contact)

Cam Askew

Google Research

Bingyi Cao

Google Research

Ondrej Chum

Czech Technical University

Noa Garcia

Osaka University

Bohyung Han

Seoul National University

Guangxing Han

Columbia University

Varsha Hedau


Pradeep Natarajan


Robinson Piramuthu


Jack Sim

Google Research

Giorgos Tolias

Czech Technical University

Tobias Weyand

Google Research

Yue Wu


Xu Zhang

Amazon Alexa

Torsten Sattler

Czech Technical University in Prague

