How we enhance AWS SageMaker Object Detection with “Mandarins”
Object detection is an AI technique used to locate objects in an image. Its applications range from detecting human faces and cars to medical examinations such as detecting tumors.
In this post we share a trick for improving the accuracy of the AWS SageMaker Object Detection algorithm: supplying negative samples through its built-in multiclass support.
The carsales retail portal facilitates buying and selling cars. There are approximately 250,000 cars on our platform, and around 5,000 new ads are submitted each day. To keep the quality of the ads high, the customer support team (CST) manually reviews and approves each ad. Recently carsales deployed an AI technology called Tessa, which helps automate the approval process: what used to take CST 3.5 hours now takes merely 7 seconds. This also dramatically improves the consumer experience.
Tessa has many rules in place for approving an ad, and one of them is to make sure that there is at least one photo with a visible rego (registration) plate. Existing rego recognition services do not help much in challenging photo conditions, such as when the rego plate is at too steep an angle or when the lighting is poor, and this results in many missed detections. To overcome this issue, we built an AI that detects rego plates using AWS SageMaker.
Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. It is a fully managed service that covers the entire machine learning workflow: labeling and preparing your data, choosing an algorithm, training the model, tuning and optimizing it for deployment, making predictions, and taking action.
The data required for this project consists of car images (with a visible rego plate) and a CSV file containing metadata such as the bounding box location of the rego plate in each image. We downloaded around 11,000 car images from our database. However, these were a mix of photos taken from various angles, and some of them, such as photos of the GPS infotainment unit, engine, or boot, do not show a rego plate.
Obviously, our first step was to get rid of the images that do not show a rego plate. For this we used our AI tech, Cyclops.
Cyclops can classify car images into 27 categories, such as boot, passenger seat, side mirror, dashboard, full rear, and full front, with 97.2% accuracy.
We used Cyclops to categorize the images and remove the 9,500 that did not show a rego plate, leaving us with only 1,500 images. Next came the unavoidable job of manually labeling those 1,500 images, although the workload had already been dramatically reduced.
We then split the 1,500 images into 1,300 for training and 200 for validation, and uploaded them to our S3 bucket together with a JSON file, a reformatted version of our CSV file that satisfies the AWS SageMaker input requirements.
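As a rough illustration of this conversion step, the sketch below turns rows of a labels CSV into the per-image JSON annotation layout that the SageMaker built-in Object Detection algorithm expects (as we understand it from the AWS documentation), and then holds out part of the data for validation. The CSV column names, the sample rows, and the helper names `row_to_annotation` and `split_train_validation` are assumptions made for illustration; they are not the layout of our actual file.

```python
import csv
import io
import random


def row_to_annotation(row, class_id=0, class_name="rego_plate"):
    """Convert one CSV row (a dict of strings) into a SageMaker-style annotation dict."""
    return {
        "file": row["file"],
        "image_size": [
            {"width": int(row["img_w"]), "height": int(row["img_h"]), "depth": 3}
        ],
        "annotations": [
            {
                "class_id": class_id,
                "left": int(row["left"]),
                "top": int(row["top"]),
                "width": int(row["box_w"]),
                "height": int(row["box_h"]),
            }
        ],
        "categories": [{"class_id": class_id, "name": class_name}],
    }


def split_train_validation(items, n_validation, seed=0):
    """Shuffle deterministically, then hold out n_validation items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical CSV content; the column names are assumptions.
SAMPLE_CSV = """file,img_w,img_h,left,top,box_w,box_h
car_0001.jpg,800,600,310,420,150,40
car_0002.jpg,800,600,295,400,160,45
car_0003.jpg,1024,768,400,520,200,55
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
annotations = [row_to_annotation(r) for r in rows]
train, validation = split_train_validation(annotations, n_validation=1)
```

In our case the same idea would be applied with 1,500 rows and `n_validation=200`, with each annotation written out as JSON next to the images in S3.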
Building the Model
Of the many built-in algorithms provided by AWS SageMaker, Object Detection was the most obvious choice for our case. We created a Jupyter notebook and started building our training script. AWS SageMaker provides an object detection training script as a starting point, so we only needed to modify the S3 bucket location where our training set is stored and the output location of the model.
In less than 5 minutes, we had started our training job. We monitored the training progress continuously through the CloudWatch logs. Training completed after an hour on an ml.p2.16xlarge instance, and we got a validation accuracy of 93.5%.
Creating the model endpoint was as simple as executing one line of code, and that was it: we had an API endpoint to call, ready to serve inference. It took us around 1.5 weeks to get to this point, with the majority of the time spent on data preparation. This is a real game changer given that, in our experience, building an end-to-end AI tech like this used to take us at least 2 months.
Testing the Model
As a further verification, we tested our model against 1,500 car images taken from various angles. To our surprise, we got an accuracy of only 20%, far lower than the 93.5% validation accuracy we saw during training. A more detailed confusion matrix analysis showed that the false positive rate (the model says there is a rego plate when there is none) was sitting at a high 80%, which was the primary contributor to the accuracy drop. Even after raising the confidence score threshold to 0.5, sacrificing recall, we were still getting a fairly high false positive rate of 4%. This is not acceptable, as a false positive causes an ad without a visible rego plate to be approved, jeopardising our approval process. In contrast, a false negative is not really a deal breaker, as it just means the ad is sent to our customer support team to verify.
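To make the measurement concrete, here is a minimal sketch of an image-level confusion matrix at a given confidence threshold. It assumes that an image counts as "plate detected" when at least one predicted box meets the threshold, and that the false positive rate is FP / (FP + TN); both are assumptions for illustration, as are the function names. The sample scores are made up and are not our test results.

```python
def image_has_plate(scores, threshold):
    """An image counts as 'plate detected' if any predicted box meets the threshold."""
    return any(score >= threshold for score in scores)


def confusion_counts(samples, threshold):
    """samples: list of (truly_has_plate, [confidence scores of predicted boxes])."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, scores in samples:
        predicted = image_has_plate(scores, threshold)
        if predicted and truth:
            counts["tp"] += 1
        elif predicted and not truth:
            counts["fp"] += 1
        elif not predicted and truth:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def false_positive_rate(counts):
    """Share of plate-free images on which the model still reported a plate."""
    negatives = counts["fp"] + counts["tn"]
    return counts["fp"] / negatives if negatives else 0.0


# Made-up example: two images with a plate, four without.
samples = [
    (True, [0.92]),
    (True, [0.30]),
    (False, [0.15, 0.40]),
    (False, [0.65]),
    (False, []),
    (False, [0.05]),
]

low = confusion_counts(samples, threshold=0.1)
high = confusion_counts(samples, threshold=0.5)
print(false_positive_rate(low), false_positive_rate(high))
```

Raising the threshold lowers the false positive rate in this toy data, but it also turns one true detection into a false negative, which is the recall trade-off described above.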
We also noticed that false positives happened more frequently on images such as dashboards and GPS infotainment units, which contain lots of objects that look like a rego plate. The validation accuracy during training did not reveal this because our validation set did not contain car images without a rego plate. Given these facts, we hypothesized that although our model did a great job of locating a rego plate when there was one, it was easily fooled into thinking there was a rego plate when there was none.
We realized our mistake. Our training set did not contain any images of cars without a rego plate, or, to use the proper terminology, negative samples. We should train our model with a balanced mix of positive and negative samples. This way, the AI will learn to ignore objects that merely look like a rego plate.
We expanded our training set to include a balanced number of images of cars with and without a rego plate and restarted our training, but our quest was cut short because this was not possible. AWS SageMaker requires every image to have at least one bounding box, and a negative image (an image without a rego plate) does not have any.
So, what to do? While we were banging our heads in despair, we saw a mandarin 🍊 sitting in the corner of our desk, and it triggered a light bulb moment. We just needed a way to include images without a rego plate in our training set, right?
The SageMaker Object Detection algorithm allows training with multiple classes, so we decided to train with two classes: Rego Plate and Mandarin. In each image without a rego plate, we digitally placed a mandarin. Now every image has a bounding box and SageMaker is happy.
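The annotation side of this trick can be sketched as follows. Every image gets exactly one box: a rego plate box (class 0) when a plate was labeled, otherwise a box around the pasted mandarin (class 1). The bottom-right placement, the 64-pixel size, the margin, and the helper names are assumptions for illustration, and the pixel compositing of the mandarin onto the image is deliberately left out.

```python
REGO_PLATE, MANDARIN = 0, 1

CATEGORIES = [
    {"class_id": REGO_PLATE, "name": "rego_plate"},
    {"class_id": MANDARIN, "name": "mandarin"},
]


def mandarin_box(img_w, img_h, size=64, margin=10):
    """Box for a mandarin pasted in the bottom-right corner (placement is an assumption)."""
    size = min(size, img_w - 2 * margin, img_h - 2 * margin)
    return {
        "class_id": MANDARIN,
        "left": img_w - margin - size,
        "top": img_h - margin - size,
        "width": size,
        "height": size,
    }


def annotate(file, img_w, img_h, plate_box=None):
    """Build a two-class annotation: plate box if labeled, mandarin box otherwise."""
    if plate_box is not None:
        box = {"class_id": REGO_PLATE, **plate_box}
    else:
        box = mandarin_box(img_w, img_h)
    return {
        "file": file,
        "image_size": [{"width": img_w, "height": img_h, "depth": 3}],
        "annotations": [box],
        "categories": CATEGORIES,
    }


# A positive image keeps its labeled plate; a negative image gets a mandarin.
positive = annotate(
    "full_front.jpg", 800, 600,
    plate_box={"left": 310, "top": 420, "width": 150, "height": 40},
)
negative = annotate("dashboard.jpg", 800, 600)
```

With annotations like these, the training job would be configured for two classes (the algorithm's `num_classes` hyperparameter), and at inference time only detections of the rego plate class would count towards the approval rule.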
With very high hopes, we restarted the training and tested the new model. We ran the test again, and the confusion matrix showed a much better false positive rate of 20%, as opposed to 80% previously. Furthermore, at a confidence score threshold of 0.5 the false positive rate was merely 0.8%, compared with 4% previously.
You can also see from the precision and recall curves above that the intersection between precision and recall for our new model sits at 0.87, which is much better than the old model's 0.8.
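For readers who want to reproduce this kind of comparison, the sketch below finds the point where precision and recall intersect (the break-even point) by sweeping the confidence threshold. It assumes one scored prediction per image, paired with whether the image truly contains a rego plate; the function names and the sample data are made up for illustration and are not our measurements.

```python
def precision_recall(scored, threshold):
    """scored: list of (confidence, truly_has_plate), one entry per image."""
    tp = sum(1 for s, y in scored if s >= threshold and y)
    fp = sum(1 for s, y in scored if s >= threshold and not y)
    fn = sum(1 for s, y in scored if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def break_even(scored, steps=100):
    """Return (threshold, precision, recall) where precision and recall are closest."""
    best = None
    for i in range(steps + 1):
        threshold = i / steps
        precision, recall = precision_recall(scored, threshold)
        gap = abs(precision - recall)
        # Keep the first threshold that achieves the smallest gap.
        if best is None or gap < best[0]:
            best = (gap, threshold, precision, recall)
    return best[1:]


# Made-up scores: four images with a plate, two without.
scored = [
    (0.95, True),
    (0.85, True),
    (0.75, False),
    (0.65, True),
    (0.35, True),
    (0.15, False),
]

threshold, precision, recall = break_even(scored)
print(threshold, precision, recall)
```

A higher break-even value means the model can reach a given precision while giving up less recall, which is why it is a convenient single number for comparing two models.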
AWS SageMaker expedites the process of building AI without us having to worry too much about building infrastructure. It dramatically reduced our development time, from the roughly 2 months it normally takes down to 2 weeks, and without the need for deep technical expertise. We have also shown that it is possible to supply negative samples to the Object Detection algorithm by using its multiclass support, enhancing the capabilities of an already awesome tool. We do hope, though, that this capability will be natively supported in a future update.
I would also like to give credit to Eric Yuxuan Lin, an AI Software Engineer in our team who worked on this project.