Finding the Right Data: An Interview with the Founders of Mellow
Keymakr is supporting innovative machine learning projects by providing diverse, high-quality image and video datasets. One of these collaborations is with Mellow, a machine learning platform for small and midsize businesses. We talked to Mellow Co-Founders Nathan Collins and Ankur Patel to find out more about their project. Answers have been combined and edited for clarity.
Tell us a little bit about Mellow.
Mellow is a fully managed, automated machine learning platform that develops and deploys AI models quickly and affordably for SMBs. Our machine learning, natural language processing, and computer vision models can be used for a variety of automated data solutions, including fraud detection, transaction classification, named entity resolution, credit scoring, data quality control, customer segmentation, and object identification.
Maintaining and monitoring an ML model requires a lot of investment in in-house talent and resources. Our platform takes data, labels, and instructions, then runs a series of models according to whichever template is best suited to the task. Once a successful model has been trained, results can be shared via an API. Clients don't need ML expertise, just data, which reduces the workload for companies.
What is the project you have been working on with Keymakr?
The goal of the Mellow-Keymakr collaboration was to demonstrate advanced computer vision technologies. Keymakr supplied thousands of labeled images of people wearing and not wearing masks.
Mellow used those images to build a custom video monitoring solution on its computer vision model. This custom solution identifies whether a person in a video is wearing a mask and what percentage of the mask is applied correctly.
The ultimate aim was to build a face detection application and a spatial distancing application for use in commercial settings in an effort to return workforces and tenants to COVID-compliant settings.
Why was this project important?
Technology solutions like this have tremendous value in helping control and monitor the spread of COVID-19. Further, this technology can be used for a wide range of applications in other industries, such as agriculture, construction, public safety, and manufacturing, to name just a few.
What challenges did you face during the development of this application?
Continued testing during development uncovered some problems. Most importantly, the model was not improving without diverse data. In the wild, you are going to get data that is much noisier. Examples include different environments, facial hair such as beards, facial coverings such as hijabs, skin color, hair length, non-compliant mask wearing, and different styles of mask. The model needs to be able to separate and learn from all these different contexts.
How were you able to overcome this issue?
Keymakr was able to source images that addressed the lack of diversity in the data and helped us build a robust computer vision application. Keymakr separated the data into categories, such as ethnicity and gender. This allowed us to easily change and refine the model.
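As a rough illustration of why categorized data makes a model easier to refine, the sketch below computes accuracy separately for each category of test image, so that the weakest category can be targeted with more data. The function name, category names, and sample records are hypothetical and are not taken from Mellow's platform.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: accuracy} for (category, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, truth, predicted in records:
        total[category] += 1
        if truth == predicted:
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Hypothetical evaluation records; categories and labels are illustrative only.
sample_records = [
    ("category_a", "masks", "masks"),
    ("category_a", "not masks", "masks"),
    ("category_b", "masks", "masks"),
    ("category_b", "not masks", "not masks"),
]

per_category = accuracy_by_category(sample_records)
# The category with the lowest accuracy is a candidate for more training data.
weakest_category = min(per_category, key=per_category.get)
```

A breakdown like this is only possible when each image carries a category tag, which is the practical benefit of receiving the data already separated.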
Machine learning finds the nuances. Our model was given training images to learn from, separated into “masks” and “not masks” and annotated with bounding boxes. Images of masks that did not fit properly were placed in the “not masks” set. The model was able to correctly parse the differences between these subtly differentiated images. It was then further challenged with CCTV video. Video offers a realistic perspective and a rigorous test for the performance of the application.
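The labeling scheme described above can be sketched in a few lines. This is a minimal illustration, assuming each annotation records a bounding box, whether a mask is present, and whether it fits correctly; the `Annotation` class and its field names are hypothetical, not the actual Keymakr or Mellow data format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Annotation:
    """One annotated face region (hypothetical schema)."""
    image_id: str
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
    mask_present: bool
    fitted_correctly: bool


def training_label(annotation: Annotation) -> str:
    """Map an annotation onto the two training sets.

    An improperly fitted mask is treated the same as no mask at all.
    """
    if annotation.mask_present and annotation.fitted_correctly:
        return "masks"
    return "not masks"


# Illustrative annotations only.
annotations = [
    Annotation("img_001", (40, 30, 120, 140), mask_present=True, fitted_correctly=True),
    Annotation("img_002", (55, 20, 130, 150), mask_present=True, fitted_correctly=False),
    Annotation("img_003", (10, 15, 90, 110), mask_present=False, fitted_correctly=False),
]

labels = [training_label(a) for a in annotations]
```

Folding poorly fitted masks into the negative class keeps the task binary while still teaching the model that a mask worn under the nose or chin should not count.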
Based on your experience, what are the key advantages of outsourcing image data collection?
Sourcing images by, for example, crawling the public web can be prohibitively time-consuming and difficult. There are also data rights implications to be considered, and this can often be a problematic gray area. Often the curated datasets necessary for the development of a successful model can’t be found on the public web and need to be sourced by providers with data collection expertise.
Annotations are also difficult to accomplish in-house. It is therefore essential to leverage external providers, like Keymakr, as a source of both data collection and annotation services.
Data collection and annotation for computer vision
Keymakr is working with innovators and leaders in machine learning and AI. Take advantage of our data collection expertise, proprietary annotation tools, and managed annotation team. Contact a team member to book your personalized demo today.