Saurabh Varshneya
Chief AI Scientist

ImagoBuddy, a Chatbot recognizing objects in images

Transfer learning and computer vision chatbot for object recognition

What if we could create a bot that speaks images and GIFs as fluently as English and can recognize images in a live conversation? This is the story of how we created ImagoBuddy, a bot that can recognize household objects like vacuum bags and bulbs.

The background

At BotSupply, we try to push the boundaries of what applied AI and chatbots can do together. In our quest to make smarter chatbots, we have created different kinds of bots for many enterprise clients. This time, our computer vision team and I wanted to explore how vision could empower bots, and we came up with ImagoBuddy.

The use case

The use case can be a very simple one. For instance, we can start from an everyday situation like replacing household products such as a lightbulb, a tap or a vacuum bag. The process is surprisingly troublesome, and if you’ve ever taken out your last vacuum bag, you know this.

First, we have to find out what kind of product we have. Then, we need to find out which part needs to be replaced, where to get it, and how to order it. The first issue is the hardest, and this is where both computer vision and a platform-agnostic chatbot can help!

What if you could just send a picture to an AI-powered bot and it could identify the product for you…

Train your model hard and harvest the fruits with REST: Ahhhh!

Step 1: Implement the AI

The first task was to implement an AI that could recognize objects. Creating such an image recognition system requires data. We collected this data by recording videos of each household product. We then converted the videos to images and trained our neural network on them to identify household objects in real time. We used transfer learning and fine-tuning techniques with some tweaks, and were able to identify objects correctly 9 out of 10 times.
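To give an idea of the video-to-image step, here is a minimal sketch. It assumes OpenCV (the `opencv-python` package) for reading the videos; the function names, the output file naming and the sampling rate of two images per second are illustrative choices of this sketch, not details from our actual pipeline.

```python
from pathlib import Path


def sampling_step(video_fps, images_per_second=2.0):
    """How many frames to skip between saved images (at least 1)."""
    if video_fps <= 0 or images_per_second <= 0:
        return 1
    return max(1, round(video_fps / images_per_second))


def frame_indices(total_frames, video_fps, images_per_second=2.0):
    """Indices of the frames that would be saved for a video of given length."""
    if total_frames <= 0:
        return []
    return list(range(0, total_frames, sampling_step(video_fps, images_per_second)))


def extract_frames(video_path, out_dir, images_per_second=2.0):
    """Save sampled frames of one video as JPEG files; returns the number saved."""
    import cv2  # assumption: OpenCV is installed; imported lazily on purpose

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    capture = cv2.VideoCapture(str(video_path))
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unknown
    step = sampling_step(fps, images_per_second)
    index, saved = 0, 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of video or unreadable file
            break
        if index % step == 0:
            cv2.imwrite(str(out / f"frame_{index:06d}.jpg"), frame)
            saved += 1
        index += 1
    capture.release()
    return saved
```

Saving only a few frames per second keeps near-identical neighbouring frames out of the training set; writing one output folder per product gives a directory layout that most image classification tools can read directly.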

Transfer learning is a technique where one takes the features learned by a neural network trained on a large real-world dataset like ImageNet and uses those features to train on a small dataset, like our set of household product images.

You can find a nice blog post on how to perform transfer learning, written by the author of Keras, the famous deep learning library, here:

Building powerful image classification models using very little data. But what's more, deep learning models are by nature highly repurposable: you can take, say, an image classification or… (blog.keras.io)
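As a rough illustration of transfer learning followed by fine-tuning, here is a minimal Keras sketch. The MobileNetV2 backbone, the 224x224 input size, the dropout rate, the learning rates and the number of unfrozen layers are all assumptions made for this example; this article does not specify the network or hyperparameters we actually used.

```python
IMAGE_SIZE = (224, 224)  # assumed input resolution for this sketch


def build_transfer_model(num_classes, weights="imagenet"):
    """Frozen ImageNet backbone plus a small trainable classification head."""
    import tensorflow as tf  # assumption: TensorFlow 2.x is installed

    base = tf.keras.applications.MobileNetV2(
        input_shape=IMAGE_SIZE + (3,),
        include_top=False,   # drop the original 1000-class ImageNet head
        weights=weights,
        pooling="avg",       # one feature vector per image
    )
    base.trainable = False   # transfer learning: reuse the features as they are

    inputs = tf.keras.Input(shape=IMAGE_SIZE + (3,))
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)  # [0,255] -> [-1,1]
    x = base(x, training=False)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model, base


def enable_fine_tuning(model, base, trainable_layers=20):
    """Unfreeze the last backbone layers and recompile with a small learning rate."""
    import tensorflow as tf

    base.trainable = True
    for layer in base.layers[:-trainable_layers]:
        layer.trainable = False
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

A typical run would first call `model.fit(...)` with the backbone frozen, then call `enable_fine_tuning(model, base)` and train for a few more epochs. The much smaller learning rate in the second phase keeps the pretrained features from being overwritten by the small dataset.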

Step 2: Deploy the AI

The next task was an important one: providing our trained deep learning model as a ready-to-use web API. We deployed our model on Google Cloud Platform. All we needed then was a REST call to access our API. More details on how one could do this can be found in a nice article from Rahul:

Tensorflow + Docker = Production ready AI product. Everyone is talking about training the Deep Learning models and fine tuning them but very few talks about the… (medium.com)
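To make the idea of a model behind a web API concrete, here is a minimal sketch using only the Python standard library. The `/predict` path, the raw-image request body and the `{"label", "confidence"}` JSON response are hypothetical choices for this example and not the interface of our deployed service; `predict` stands for whatever function wraps the trained model.

```python
import json


def _respond(start_response, status, payload):
    """Send a JSON response through the WSGI start_response callback."""
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def make_app(predict):
    """Build a WSGI app; `predict` maps image bytes to (label, confidence)."""
    def app(environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        path = environ.get("PATH_INFO", "/")
        if method != "POST" or path != "/predict":
            return _respond(start_response, "404 Not Found",
                            {"error": "use POST /predict"})
        length = int(environ.get("CONTENT_LENGTH") or 0)
        image_bytes = environ["wsgi.input"].read(length)
        if not image_bytes:
            return _respond(start_response, "400 Bad Request",
                            {"error": "empty request body"})
        label, confidence = predict(image_bytes)
        return _respond(start_response, "200 OK",
                        {"label": label, "confidence": float(confidence)})
    return app
```

For a local try-out, `wsgiref.simple_server.make_server("", 8000, make_app(predict)).serve_forever()` is enough; inside a container on a cloud platform one would normally put a production-grade server in front of the same app.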

Glimpse: Chatbot Design Flow

Step 3: Design the chatbot UX

Designing the bot well is a very important task, as it's the interface used directly by end users. Designing a chatbot like ImagoBuddy starts all the way back with defining the use case and understanding the context and the users, which our Conversational UX designers Simon and Patrick have described in detail here:

Designing for AI — why & how. What I’ve learned (so far) about design and it’s necessity when developing AI products. (uxdesign.cc)

ImagoBuddy lives on Facebook Messenger, where we used the Messenger Platform APIs to make our bot accessible from anywhere over the internet.

Once the bot design was ready, we just needed to make a REST call from our bot to our web-based API whenever a real-time response was needed.
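On the bot side, that REST call can be sketched with the Python standard library as follows. The endpoint URL, the raw-bytes request format, the `label` and `confidence` response fields and the reply wording are all hypothetical and only serve to illustrate the flow from an incoming picture to a chat reply.

```python
import json
import urllib.request

API_URL = "https://example.com/predict"  # hypothetical endpoint, replace with your own


def build_request(image_bytes, url=API_URL):
    """Prepare a POST request carrying the raw image bytes."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def recognize(image_bytes, url=API_URL, timeout=10):
    """Call the recognition API and return its JSON answer as a dict."""
    with urllib.request.urlopen(build_request(image_bytes, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def reply_for(prediction, min_confidence=0.5):
    """Turn an API answer into the text the bot sends back to the user."""
    label = prediction.get("label")
    confidence = float(prediction.get("confidence", 0.0))
    if not label or confidence < min_confidence:
        return "Sorry, I am not sure what that is. Could you send another picture?"
    return f"That looks like a {label.replace('_', ' ')}."
```

In a Messenger webhook handler, the bot would download the attachment the user sent, pass its bytes to `recognize`, and send the result of `reply_for` back as a text message.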

Step 4: Pushing chatbots forward with NLP and Computer-Vision

Finally, we presented ImagoBuddy at the Innovation Roundtable Summit 2017 in Copenhagen, where it was well received by the participants.

Challenges

As nothing comes easy, we also faced technical challenges, which we addressed in later versions of ImagoBuddy. One of the major challenges was handling situations where a user sends a random image that does not belong to any of the covered household categories. We tackled this by training our neural network classifier with an additional “none” category.
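The effect of the extra category on the classifier output can be sketched as below. The class names and the example scores are made up for illustration; the only idea taken from our setup is that “none” is trained as one more class next to the real products, using images that show none of them.

```python
import math

# Illustrative label set: the covered products plus the catch-all class.
CLASS_NAMES = ["light_bulb", "tap", "vacuum_bag", "none"]


def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def decode(scores, class_names=CLASS_NAMES):
    """Return (label, probability) for the most likely class."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]


def is_covered(label):
    """True if the prediction is a real product rather than the 'none' class."""
    return label != "none"
```

When `decode` returns “none”, the bot can ask for a different picture instead of naming the wrong product, which is a much better experience than a confident wrong answer.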

What’s next?

Our next ambition is to create bots that can do things like reading sign language, detecting objects and understanding music. I wrote this short article to give a glimpse of how such a bot was created at BotSupply and how you can create one.

Thanks for reading this article! If you would like another article with more details, please contact me at saurabh@botsupply.ai.