The fashion industry is a visual world. Fashion commerce sites display millions of images every day to serve consumers the latest trends and products. However, automatically categorizing and searching through large collections of images according to fine-grained attributes remains a challenge. In this talk we present our research on deep learning techniques to automatically identify fine-grained attributes in both images and text in the presence of incomplete and noisy data. We focus on attributes such as necklines, skirt and sleeve shapes, patterns, textures, colors and occasions. This task is especially useful for online stores that want to automatically organize and mine visual items according to their attributes without human input. It is also useful for end users who wish to find specific items when no text is available describing the target image. We compare the results of the deep learning approach with classical techniques such as Latent Dirichlet Allocation and Canonical Correlation Analysis. Our results show that it is possible to design algorithms that automatically “translate” visual concepts into text and vice versa.
Susana Zoghbi received a PhD in Computer Science in December 2016 from KU Leuven in Belgium. Her research interests lie at the intersection of computer vision and natural language processing, and include deep learning, topic modelling and probabilistic graphical models. During her PhD, she developed latent-variable models for understanding language and images in social media and e-commerce data. She holds two Master’s degrees: one in Mechanical Engineering, where her research focused on human–robot interaction technologies, and one in Mathematical Physics, where she focused on gravitational fluctuations in domain wall spacetimes. In 2014, she was awarded a Google Anita Borg Scholarship.