How we enhance AWS SageMaker Object Detection with “Mandarins”

Business Case

The carsales retail portal facilitates buying and selling cars. There are approximately 250,000 cars on our platform, and 5,000 new ads are submitted each day. To keep the quality of the ads high, the customer support team (CST) manually reviews and approves each ad. Recently carsales deployed an AI technology called Tessa, which helps automate the approval process and cuts it from the 3.5 hours it used to take CST down to merely 7 seconds. This also dramatically improves the consumer experience.

Data Preparation

The data required for this project consists of car images (with a visible rego plate) and a CSV file containing metadata such as the bounding box location of the rego plate in each image. We downloaded around 11,000 car images from our database. However, they are a mix of photos taken from various angles, and some do not show a rego plate at all, such as shots of the GPS infotainment unit, the engine or the boot.
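To make the metadata step concrete, here is a minimal sketch of turning bounding-box rows from a CSV into one annotation record per image. The CSV column names, the sample rows and the `rego_plate` class name are hypothetical, not our actual schema. The record layout is modelled on the JSON annotation format described in the SageMaker Object Detection documentation, so check the current docs before relying on it.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical CSV layout: one row per bounding box (made-up sample values).
SAMPLE_CSV = """file,img_width,img_height,left,top,width,height
car_001.jpg,800,600,310,420,180,60
car_002.jpg,1024,768,400,500,200,70
"""


def rows_to_annotations(csv_text, class_id=0, class_name="rego_plate"):
    """Group bounding-box rows by image and build one annotation dict per image."""
    boxes = defaultdict(list)
    sizes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["file"]
        sizes[name] = (int(row["img_width"]), int(row["img_height"]))
        boxes[name].append({
            "class_id": class_id,
            "left": int(row["left"]),
            "top": int(row["top"]),
            "width": int(row["width"]),
            "height": int(row["height"]),
        })

    annotations = {}
    for name, image_boxes in boxes.items():
        width, height = sizes[name]
        annotations[name] = {
            "file": name,
            "image_size": [{"width": width, "height": height, "depth": 3}],
            "annotations": image_boxes,
            "categories": [{"class_id": class_id, "name": class_name}],
        }
    return annotations


if __name__ == "__main__":
    # Print one JSON record per image.
    for annotation in rows_to_annotations(SAMPLE_CSV).values():
        print(json.dumps(annotation))
```

Images that never appear in the CSV (the engine, boot and infotainment shots) produce no record at all in this sketch.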

Images of cars from various angles
Cyclops Tech

Building the Model

Of the many built-in algorithms provided by AWS SageMaker, Object Detection is the most obvious choice for our case. We created a Jupyter notebook and started to build our training script. AWS SageMaker provides an object detection training script as a starting point, so we only needed to change the S3 bucket location where our training set is stored and the output location of the model.
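As a rough illustration of how little needs changing, the sketch below collects the S3 locations and a few hyperparameters in one plain dictionary. It is not the notebook we used. The bucket name, prefix and numeric values are placeholders. The channel names and hyperparameter keys follow what the SageMaker Object Detection documentation lists for image-format input, as far as we recall, so treat them as assumptions to verify.

```python
def build_training_config(bucket, prefix, num_classes, num_training_samples):
    """Collect S3 locations and a few hyperparameters in one place."""
    base = f"s3://{bucket}/{prefix}"
    return {
        # Input channels for images and their annotation files (assumed names).
        "channels": {
            "train": f"{base}/train",
            "validation": f"{base}/validation",
            "train_annotation": f"{base}/train_annotation",
            "validation_annotation": f"{base}/validation_annotation",
        },
        # Where the trained model artefact should be written.
        "output_path": f"{base}/output",
        "hyperparameters": {
            "num_classes": num_classes,
            "num_training_samples": num_training_samples,
            "epochs": 30,           # placeholder value
            "mini_batch_size": 16,  # placeholder value
        },
    }


if __name__ == "__main__":
    # "my-bucket" and "rego-detection" are hypothetical names.
    config = build_training_config("my-bucket", "rego-detection", 1, 9500)
    print(config["channels"]["train"])
    print(config["output_path"])
```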

Jupyter Notebook
Monitoring validation and training accuracy via CloudWatch

Testing the Model

As a further verification, we tested our model against 1,500 car images taken from various angles. To our surprise, we were only getting an accuracy of 20%, far lower than the 93.5% validation accuracy reported during training. A more detailed confusion matrix analysis showed that the false positive rate (the model says there is a rego plate when there is none) was sitting at 80%, which was the primary contributor to the accuracy drop. Even after raising the confidence score threshold to 0.5 and sacrificing recall, we were still getting a false positive rate of 4%. This is not acceptable, because a false positive causes an ad without a visible rego plate to be approved, jeopardising our approval process. In contrast, a false negative is not a deal breaker: it only means the ad is sent to our customer support team for verification.
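The trade-off between false positives and recall can be sketched as a simple threshold sweep. This is an image-level simplification, not our evaluation code: it assumes each test image is reduced to its highest detection confidence and a ground-truth flag, takes the false positive rate as FP / (FP + TN), and ignores how well the box is localised. The sample scores are made up for illustration.

```python
def rates_at_threshold(predictions, threshold):
    """Compute image-level rates for one confidence threshold.

    predictions: list of (best_score, has_plate) pairs, one per test image,
    where best_score is the highest detection confidence (0.0 if none).
    """
    tp = fp = tn = fn = 0
    for score, has_plate in predictions:
        predicted_plate = score >= threshold
        if predicted_plate and has_plate:
            tp += 1
        elif predicted_plate and not has_plate:
            fp += 1  # ad without a plate would be wrongly approved
        elif has_plate:
            fn += 1  # ad falls back to manual review
        else:
            tn += 1
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    sample = [(0.95, True), (0.80, True), (0.40, True),
              (0.985, False), (0.30, False), (0.10, False)]
    for threshold in (0.2, 0.5, 0.9):
        print(threshold, rates_at_threshold(sample, threshold))
```

Raising the threshold lowers the false positive rate, but a confident mistake (like the 0.985 score in the sample) survives any reasonable threshold.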

False Positive error rate at various confidence score thresholds
Odometer counter was mistakenly identified as a rego plate with a high confidence score of 0.985
Positive sample (left), Negative sample (right)

Solution

We expanded our training set to include a balanced number of images of cars with and without a rego plate and tried to restart our training, but our quest was cut short because this was not possible. AWS SageMaker requires every image to have at least one bounding box, and a negative image (an image without a rego plate) has none. Our workaround was to place a mandarin in each negative image and label it as a separate class, making use of the algorithm's multiclass support.

A randomly sized mandarin is placed at a random location in images with no rego plate
False positive error rate comparison between models at various confidence score thresholds
Precision and Recall curve comparison between models at various confidence score thresholds
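The mandarin placement can be sketched as follows. This is a minimal illustration rather than our production code: the class ids, class names, size limits and function names are all assumptions, and the pixel-level pasting of the mandarin onto the photo is left to whichever imaging library you prefer. The sketch only picks the random box and writes the matching annotation, so that a negative image still carries one bounding box.

```python
import random

PLATE_CLASS_ID = 0     # assumed id for the rego plate class
MANDARIN_CLASS_ID = 1  # assumed id for the dummy "mandarin" class


def random_mandarin_box(img_width, img_height, rng, min_frac=0.05, max_frac=0.20):
    """Pick a random square box that fits entirely inside the image."""
    shortest = min(img_width, img_height)
    smallest = max(1, int(shortest * min_frac))
    largest = max(smallest, int(shortest * max_frac))
    size = rng.randint(smallest, largest)
    left = rng.randint(0, img_width - size)
    top = rng.randint(0, img_height - size)
    return {"class_id": MANDARIN_CLASS_ID, "left": left, "top": top,
            "width": size, "height": size}


def annotate_negative(file_name, img_width, img_height, rng):
    """Build the annotation for an image that has no rego plate."""
    return {
        "file": file_name,
        "image_size": [{"width": img_width, "height": img_height, "depth": 3}],
        # The only labelled object is the mandarin pasted at this location.
        "annotations": [random_mandarin_box(img_width, img_height, rng)],
        "categories": [
            {"class_id": PLATE_CLASS_ID, "name": "rego_plate"},
            {"class_id": MANDARIN_CLASS_ID, "name": "mandarin"},
        ],
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so the run is repeatable
    print(annotate_negative("engine_bay.jpg", 800, 600, rng))
```

At inference time, under this scheme, only detections of the rego plate class would count towards approving an ad; a detected mandarin is simply ignored.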

Summary

AWS SageMaker expedites the process of building AI without us having to worry too much about building infrastructure. It dramatically reduced the development time, which would normally be around 2 months, down to 2 weeks, without the need for deep technical expertise. We have also shown that it is possible to supply negative samples to the Object Detection algorithm by utilizing its multiclass support, enhancing the capabilities of an already awesome tool. Still, we hope that this capability will be natively supported in a future update.

Credits

Finally, we would like to thank Aparna Elangovan and Julian Bright from AWS, who helped us get familiar with AWS SageMaker, especially in building the CI/CD pipeline.
