CNNs & Transfer Learning for furniture recognition

University of Copenhagen, group work and Kaggle competition

Project description

The model was built on approximately 200,000 images of furniture and distinguishes between 128 classes of furniture and accessories with over 80% accuracy.

Overview

The iMaterialist Challenge (Furniture) at FGVC5 introduced a large furniture dataset for classification into 128 classes. We used well-known pre-trained CNN architectures and reached over 80% accuracy, placing 175th out of 428 teams and earning the maximum grade in the course.

The scale of the problem, together with data being merged from different modalities, makes interpreting the results a challenge and opens up opportunities for better visualization techniques that could enable deeper insights, a better understanding of the inner workings of machine learning, and ultimately better models. The project was part of a Large Scale Data Analysis course.

Technologies

Keras, CNNs, Transfer Learning, Ubuntu on Microsoft Azure Cloud

Technical Details

We tried many approaches and converged on a full transfer learning approach that combined bottleneck features from two different architectures and used test-time augmentation to further improve the results.
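
As a rough illustration, the sketch below shows how bottleneck features can be extracted from two pre-trained Keras backbones and stored for reuse. The specific DenseNet variant (DenseNet121 here), the input size, and the file names are assumptions made for the example, not details taken from our report.

```python
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input as resnet_prep
from keras.applications.densenet import DenseNet121, preprocess_input as densenet_prep

# Headless backbones: include_top=False drops the ImageNet classifier and
# pooling="avg" reduces the final feature maps to one vector per image.
resnet_base = ResNet50(weights="imagenet", include_top=False, pooling="avg")
densenet_base = DenseNet121(weights="imagenet", include_top=False, pooling="avg")

def extract_bottleneck_features(images, batch_size=32):
    """images: float32 RGB array of shape (n, 224, 224, 3)."""
    # Each backbone expects its own input preprocessing, so copy first.
    r = resnet_base.predict(resnet_prep(images.copy()), batch_size=batch_size)
    d = densenet_base.predict(densenet_prep(images.copy()), batch_size=batch_size)
    # One 3072-dim vector per image (2048 from ResNet50 + 1024 from DenseNet121).
    return np.concatenate([r, d], axis=1)

# Features can be computed once, saved, and reused for fast classifier training:
# np.save("train_bottleneck.npy", extract_bottleneck_features(train_images))
```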

The final model included ResNet50 and DenseNet, each producing predictions in two modes: with and without test-time augmentation. The final prediction was therefore a weighted average of the probabilities predicted by four different modelling pipelines.
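
The sketch below illustrates the two ingredients of this step: test-time augmentation at prediction time and a weighted average over the pipelines' class probabilities. The actual augmentations and ensemble weights are not listed in this summary, so a single horizontal flip and uniform weights stand in here as placeholders.

```python
import numpy as np

def predict_with_tta(model, images):
    """Average class probabilities over the original images and a
    horizontally flipped copy (NHWC layout assumed)."""
    probs = model.predict(images)
    probs += model.predict(images[:, :, ::-1, :])  # flip along the width axis
    return probs / 2.0

def ensemble(pipeline_probs, weights):
    """Weighted average of per-pipeline class probability arrays."""
    w = np.asarray(weights, dtype="float64")
    w /= w.sum()  # normalise so the result is still a probability distribution
    stacked = np.stack(pipeline_probs)       # (n_pipelines, n_images, n_classes)
    return np.tensordot(w, stacked, axes=1)  # (n_images, n_classes)

# Four pipelines: {ResNet50, DenseNet} x {plain, TTA}; weights illustrative.
# final = ensemble([resnet_plain, resnet_tta, densenet_plain, densenet_tta],
#                  weights=[1, 1, 1, 1])
# predicted_class = final.argmax(axis=1)
```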

Throughout the project we worked with Keras using GPU-accelerated TensorFlow as the backend, on an Ubuntu machine that we configured ourselves in the Microsoft Azure cloud. An especially challenging part of the project was managing the data in the cloud and working with it efficiently.

A report with more details on the process and the models is available HERE.

My contribution

I built our transfer learning approach from the ground up, built a framework for extracting and storing bottleneck features, created a data generator to work around memory issues, implemented test-time augmentation, and created the ensemble that became our final submission.
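
As an illustration of the data generator, a minimal version using keras.utils.Sequence is sketched below; it streams stored bottleneck features from disk in batches rather than loading everything into memory. The file names and batch size are hypothetical.

```python
import numpy as np
from keras.utils import Sequence

class BottleneckSequence(Sequence):
    """Serves batches from memory-mapped .npy arrays so the feature matrix
    for ~200,000 images never has to be loaded into RAM at once."""

    def __init__(self, features_path, labels_path, batch_size=256):
        # mmap_mode="r" keeps the arrays on disk and pages slices in lazily.
        self.features = np.load(features_path, mmap_mode="r")
        self.labels = np.load(labels_path, mmap_mode="r")
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.features) / self.batch_size))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        # Slicing the memmap copies only this batch into memory.
        return np.array(self.features[lo:hi]), np.array(self.labels[lo:hi])

# head_model.fit_generator(
#     BottleneckSequence("train_bottleneck.npy", "train_labels.npy"))
```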