How Data Collection Supports Computer Vision Projects

Data collection is a vital step in the machine learning for computer vision process. Before image and video annotation can take place, raw data needs to be assembled. This data needs to be of the quality, variety, and quantity required for functional training data. To achieve this developers often make use of open source image repositories, or they deploy web scraping tools to trawl the internet for appropriate data.

However, both approaches require a significant investment of time and expertise that has the potential to distract from a company's core mission. Professional annotation providers, like Keymakr, offer data collection services that save companies money and reliably assemble high quality images and video. Keymakr achieves through a combination of annotation experience and state of the art data scraping tools.

This blog will look at five different cases where Keymakr made use of these proprietary tools to collect the data that fulfilled their client’s needs.

Weather prediction

The need: The client was developing a weather prediction model that would be able to analyse water levels and give early warning of potential floods. This required training an algorithm with images of flooded areas from a variety of locations.

The solution: Keymakr was able to assemble over 10,000 images of inundated landscapes, covering multiple geographic areas and weather conditions. These images were then annotated with information about the water level, e.g. what degree of flooding the area in question was receiving.

Face recognition

The need: The client required images of faces that reflect the diversity of human appearances, across ethnicities and genders. This would allow the facial recognition algorithm to correctly identify individuals from images without bias or error.

Video annotation | Keymakr

The solution: By utilising data scraping tools Keymakr was able to collect over 50,000 images. The data corresponded with the diversity of the real world, and featured a representative mixture of ethnicities and genders.

Medical Imaging

The need: The client was developing a machine learning algorithm that could identify brain tumours in MRI scans. To achieve this training images were required that correctly showed developing tumours.

The solution: For this project, dealing with sensitive medical information, accuracy was essential. To achieve this Keymakr partnered with hospitals for data collection, before having the annotation work carried out and verified by medical professionals.

Food industry

The need: The client was developing smart fridges that are able to identify produce and warm customers when they have expired. This meant training the algorithm with images of a wide variety of food in different states of freshness.

The solution: Keymakr’s team used their data collection expertise to assemble 20,000 images of different products. These food items were classified by categories before being annotated to reflect their condition.

Retail

The need: The client wanted to create an algorithm for ecommerce. This algorithm was designed to identify different clothing items and classify them. This would allow for easy organisation of large online catalogues, as well as facilitating effective searches and recommendations for customers.

The solution: Keymakr used data collection technology to create a large dataset of over 100,000 images. These images covered an enormous variety of clothing types, providing the variety and depth that the algorithm required.

Data collection outsourcing supports AI innovators

Keymakr is a professional annotation service with expertise in data collection and annotation. By utilizing proprietary technology and experienced teams of annotators Keymakr ensures that computer vision projects get the data they need.

Contact a team member to book your personalized demo today.