Roxana Roman

Enable your product to tell hot dogs from not hot dogs

Machine learning techniques and big data processing tools used to be available only to the best-resourced companies. Today, even small businesses can use them through out-of-the-box APIs and cloud-based solutions at a fair price. This is possible thanks to many years of machine learning research and hard work to develop the right models for the right domains.

Why did this process take so long? Largely because large amounts of data were not available, so human intuition had to fill the gap.

As the world became more and more digitalised (through the internet and sensors), the lack of data stopped being a problem, and human insight gave way to large volumes of data. The algorithms developed over those years nevertheless remain the key component of making deep learning successful. Today, the challenge is to come up with a model that combines human intuition with the acquired data, incorporating the best of both.

Machine learning techniques power features such as recommender systems, image recognition, chatbots, language translation, natural language processing, decision making and prediction. Their main goal is to make your product smarter while saving you time and money. In this article, I will outline the steps needed to make use of this cutting-edge technology by walking you through a real-life example where we used it. The product we made smarter is Universe, a fully integrated G Suite social intranet.

1. Identify your business use case

The first step is to identify your need or, more exactly, the part of your product that would benefit from a machine learning algorithm or a data processing pipeline.

In the Universe case, we wanted to improve and simplify the way clients interacted with the platform and to give them access to the right data at the right moment.

By analysing the way users collaborate with each other inside your platform, you can gain useful insights. For example, in an intranet platform where new content is displayed every day, employees can easily be overwhelmed by the quantity of data, which makes it difficult for them to be productive. In this case, a recommender would ease social collaboration by bringing useful content closer to the user.
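
To make the recommender idea concrete, here is a minimal sketch in Python. It is not the Universe implementation: the sample data, the recommend function and the scoring rule (counting likes shared with other users) are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical sample data: which posts each employee has liked.
likes = {
    "ana": {"post_1", "post_2", "post_3"},
    "bob": {"post_2", "post_3", "post_4"},
    "cat": {"post_5"},
}


def recommend(user, likes, top_n=3):
    """Rank posts the user has not liked yet, using like-minded colleagues."""
    seen = likes.get(user, set())
    scores = Counter()
    for other, other_likes in likes.items():
        if other == user:
            continue
        # The number of shared likes acts as a crude similarity weight.
        overlap = len(seen & other_likes)
        if overlap == 0:
            continue
        for post in other_likes - seen:
            scores[post] += overlap
    # Highest score first; ties are broken by post id for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [post for post, _ in ranked[:top_n]]


print(recommend("ana", likes))  # ['post_4']
```

Here "ana" shares two likes with "bob", so the one post of his that she has not liked yet is suggested. A user with no overlap, such as "cat", gets an empty list, which is the usual cold-start case that a real system has to handle separately.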

Natural language processing can be another business use case. You have been able to do nearly anything on your phone with your voice for a while now; more recently, you can use the Google Speech API to control your home or your software products without cumbersome keyboard input. This can give your products a much-wanted accessibility feature.

2. In data we trust

The next step is to identify and collect the kind of data you need. The data may also need cleaning at this point. For Universe, for example, we needed to collect the interactions between users: the posts they liked, who wrote those posts, what the posts were about, which documents users needed and for which meetings, and so on. This gave us enough data to feed an algorithm. Depending on your use case you will need different types of data, which can include text, images, video or voice.
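
As an illustration of collecting and cleaning interaction data, the sketch below turns raw "like" events into a weight per pair of users. The event layout, the cleaning rules and the build_interaction_weights name are assumptions made for this example, not the actual Universe pipeline.

```python
from collections import Counter

# Hypothetical raw events: (user who liked, author of the post, post topic).
events = [
    ("ana", "bob", "sales"),
    ("ana", "bob", "sales"),
    ("bob", "ana", "hr"),
    ("cat", "cat", "hr"),   # self-like, dropped during cleaning
    ("dan", "", "sales"),   # missing author, dropped during cleaning
]


def build_interaction_weights(events):
    """Count interactions per unordered pair of users, skipping unusable rows."""
    weights = Counter()
    for liker, author, _topic in events:
        if not liker or not author or liker == author:
            continue  # cleaning step: incomplete rows and self-interactions
        pair = tuple(sorted((liker, author)))
        weights[pair] += 1
    return weights


weights = build_interaction_weights(events)
print(weights)  # Counter({('ana', 'bob'): 3})
```

The result can be read as a weighted graph in which users are nodes and the counts are edge weights, a convenient input for network-based algorithms.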

3. Find your machine learning algorithm

Once the data is prepared, you need to choose the proper algorithm. This can be a painful process, since you will most probably identify several machine learning algorithms that would fit your need. Keep in mind, however, that the simplest algorithms work in the majority of cases: they are easier to implement and maintain and, most importantly, easier to scale. Building the right machine learning model for your business is an iterative process in which you try several similar algorithms and choose the one that best suits your case. For Universe we used the Louvain method, a simple, efficient and easy-to-implement method for identifying communities in large networks.
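
The Louvain method works by greedily grouping nodes so as to increase a score called modularity, which compares the edge weight inside each community with what a random graph with the same node degrees would give. The sketch below is not a Louvain implementation and not the Universe code; it only computes that score for a given grouping, so you can see what the algorithm optimises. The function name and the toy graph are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, weighted graph.

    edges: iterable of (u, v, weight) tuples.
    community_of: dict mapping each node to a community id.
    """
    total_weight = 0.0
    internal = defaultdict(float)  # edge weight fully inside each community
    degree = defaultdict(float)    # summed node degrees per community
    for u, v, w in edges:
        total_weight += w
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    if total_weight == 0:
        return 0.0
    return sum(
        internal[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Toy network: two triangles of colleagues joined by a single bridge (c-d).
edges = [
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
]
two_teams = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_team = {node: 0 for node in "abcdef"}

print(modularity(edges, two_teams))  # 5/14, about 0.357
print(modularity(edges, one_team))   # 0.0
```

Splitting the graph into its two triangles scores higher than lumping everyone together, which is the kind of grouping a community detection method is expected to find.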

4. Be production ready

The last step is to build a production-ready solution. This process takes longer because you should observe how your users receive your features, how they interact with them, and what you can improve and add to your statistical model.

All these steps have become much easier with the emergence of built-in solutions. One core component of business intelligence is the data warehousing tool. An example is Google’s BigQuery. This service is used for reporting and data analysis, and its aim is to free you from the hassle of installing and maintaining the right infrastructure, allowing you to tap into the full power of Google’s infrastructure. To access algorithms used in voice/sound recognition, text processing, sentiment analysis, flow detection, etc., you can use Google’s TensorFlow (or TensorFlow Lite for Android development). To train your models at scale, host them and make predictions, you can use Cloud Machine Learning Engine, which combines TensorFlow with Google’s IaaS. Another useful tool is Google Cloud Dataproc. It offers managed Apache Hadoop, Apache Spark, Apache Pig and Apache Hive services, which help you process big datasets easily, at low cost and with high scalability.

So what are you waiting for? Have you identified how machine learning can make your product stand out from the crowd? We know your product is already good, but as John D. Rockefeller used to say:

Don’t be afraid to give up the good to go for the great.

