Visual instance-level recognition and retrieval are fundamental tasks in computer vision. Despite recent advances in this field, many techniques have been evaluated on a limited number of domains, with a small number of classes. We believe that the research community can benefit from a new suite of datasets and associated challenges, both to improve understanding of the limitations of current technology and to provide an opportunity to introduce new techniques. This year, we propose the first Universal Image Embedding Challenge, where the goal is to develop image representations that work well across several domains combined. The Instance-Level Recognition (ILR) Workshop is a follow-up to four successful editions of our previous workshops: the first two focused only on landmark recognition, and the latest two expanded to two extra domains (artworks and products). Our workshop location is the Grand Ballroom B, at the David Intercontinental hotel.
Andre Araujo
Oct. 24th, 9:00am-9:10am (GMT+3)
Vicente Ordonez, Rice University
(Hosted by Tobias Weyand) Oct. 24th, 9:10am-9:40am (GMT+3)
Bingyi Cao
Oct. 24th, 9:40am-10:10am (GMT+3)
Minsu Cho, POSTECH
(Hosted by Bohyung Han) Oct. 24th, 10:10am-10:40am (GMT+3)
Xu Zhang
Oct. 24th, 11:10am-11:30am (GMT+3)
Jon Almazan, Byungsoo Ko, Geonmo Gu, Diane Larlus and Yannis Kalantidis
(Hosted by Torsten Sattler) Oct. 24th, 11:30am-11:40am (GMT+3)
Shraman Pramanick, Ewa Nowara, Joshua Gleason, Carlos Castillo and Rama Chellappa
(Hosted by Torsten Sattler) Oct. 24th, 11:40am-11:50am (GMT+3)
Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos and Nikos Komodakis
(Hosted by Torsten Sattler) Oct. 24th, 11:50am-12:00pm (GMT+3)
Mathilde Caron, Google Research
(Hosted by Giorgos Tolias) Oct. 24th, 12:00pm-12:30pm (GMT+3)
Andre Araujo
Oct. 24th, 12:30pm-12:35pm (GMT+3)
Associate Professor of CSE & AI, POSTECH
Few-shot learning has been actively studied for visual recognition tasks such as image classification and semantic segmentation. Existing methods, however, are limited in understanding diverse levels of visual cues and analyzing fine-grained correspondence relations between the query and the support images. This prevents few-shot learning from generalizing to, and being evaluated on, more realistic cases in the wild. In this talk, I will introduce our recent work on object-aware few-shot learning that tackles the challenge by leveraging multi-level feature correlations and high-order convolution/self-attention. I will also present an integrative few-shot learning framework that combines two conventional few-shot learning problems, few-shot classification and segmentation, and generalizes them to more realistic episodes with arbitrary image pairs, where each target class may or may not be present in the query. In experiments, the proposed method shows promising performance on the joint problem of classification and segmentation and also achieves the state of the art on standard few-shot segmentation benchmarks.
Associate Professor of Computer Science, Rice University
Visual representation learning has made great progress in recent years. We can reuse models that map images to a common semantic space with a high degree of accuracy; however, using these representations to search for images in large collections is not enough. When users search for images in their personal image collections, on an e-commerce site, or on the web, they generally search with very different intentions. I will outline some desirable properties of retrieval models and present some of our work on instance-level image retrieval with Reranking Transformers (RRTs) (https://arxiv.org/abs/2103.12236), iterative and interpretable retrieval using a Drill-down approach (https://arxiv.org/abs/1911.03826), and our more recent work on multilingual and multimodal learning (https://aclanthology.org/2020.findings-emnlp.328, https://arxiv.org/abs/2206.15462).
Research Scientist at Google Research
Self-supervised learning (SSL) consists in training neural network systems without using any human annotations. In this talk, I will first present how instance-level recognition has inspired the recent successful approaches in SSL. I will then show how self-supervised models can give back to the instance-level recognition community by providing features better suited to challenging image retrieval benchmarks than their class-level supervised counterparts.
Google Research (Primary Contact)
Google Research
Czech Technical University
Osaka University
Seoul National University
Columbia University
Columbia University
Amazon Alexa
Czech Technical University
Google Research
Amazon Alexa
Czech Technical University
Amazon Alexa