Results of Hotels.ng Machine Learning Hackathon: 50% no-shows predicted correctly
Hotels.ng has released the results of the hackathon they recently held (about a machine learning algorithm to predict cancellations of bookings). Here is the result as published on their blog: http://blog.hotels.ng/2015/08/14/hotels-ngs-machine-learning-hackathon-50-of-all-no-shows-predicted-accurately/.
Summary of results:
— Andela wins with 50% successfully predicted cancellations
— MEST comes second with 33% successful predictions
— Segun from Truppr delivers a good result that gets disqualified due to technicalities
I understand the exercise as a binary classification task, where the prediction model should be able to tell whether a potential guest will cancel his/her reservation. A 50% accuracy means the model is right about any given reservation only half the time, which is no better than a coin flip. This is a very poor result as far as I am concerned, and I fail to see how it can benefit hotel owners or the people at Hotels.ng.
Naturally, there could be many reasons why the best accuracy is so low. Possibly the data was insufficient for the learning task, the wrong algorithm was used for the exercise, or the developers didn’t take/have enough time to understand the data deeply enough to build the right classifiers. That said, binary classification is a very trivial task and I believe that Mark would’ve gotten a better predictor if this competition had more participants. Simple.
This is clearly a case where the winning team is the best among the worst, but still not good enough. At the risk of being called a hater, I don’t think there’s any winner to this competition.
In a better setup, the competition should (a) challenge participants to beat a reasonable baseline, or (b) build predictors that have an accuracy above a given threshold.
I’ve made this comment solely based on my meager understanding of the problem they sought to solve. My argument will be pointless if the exercise wasn’t a binary classification task.
Be that as it may, congratulations to the winning team. It’s a thing of pride to see Nigerian companies and developers diving into AI-related problems and solutions for change.
Think of it this way: There are 5000 reservations. 500 will get cancelled. You predict that 250 out of those 500 will be cancelled. That’s 50% accuracy. If you just randomly guessed, you would guess 2500 wrong…
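To make the arithmetic in that comment concrete, here is a minimal sketch in Python. The 5000/500/250 figures come from the comment; the number of false alarms the winning model raised is not stated anywhere in the thread, so `ASSUMED_FALSE_ALARMS` is a purely hypothetical placeholder. Catching 250 of 500 cancellations is what is usually called recall, which is a different number from overall accuracy.

```python
def metrics(tp, fp, fn, tn):
    """Return recall, precision and accuracy from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "accuracy": (tp + tn) / total,
    }

TOTAL = 5000                 # reservations (from the comment)
CANCELLED = 500              # reservations that end up cancelled (from the comment)
KEPT = TOTAL - CANCELLED     # 4500 reservations that are honoured

# Model described in the comment: it catches 250 of the 500 cancellations.
# The false-alarm count is NOT given in the thread; 0 is an assumption.
ASSUMED_FALSE_ALARMS = 0
model = metrics(tp=250, fp=ASSUMED_FALSE_ALARMS,
                fn=250, tn=KEPT - ASSUMED_FALSE_ALARMS)

# Coin flip (expected counts): half of each group is flagged as "will cancel",
# so 250 + 2250 = 2500 reservations are guessed wrong.
coin_flip = metrics(tp=250, fp=2250, fn=250, tn=2250)

# Trivial baseline: always predict "will not cancel".
never_cancel = metrics(tp=0, fp=0, fn=CANCELLED, tn=KEPT)

for name, m in [("model", model), ("coin flip", coin_flip),
                ("never cancel", never_cancel)]:
    print(f"{name:>12}: recall={m['recall']:.2f} "
          f"precision={m['precision']:.2f} accuracy={m['accuracy']:.2f}")
```

Under these assumptions the coin flip also catches half the cancellations, but only by wrongly flagging 2250 honoured bookings, while the "never cancel" baseline scores high on accuracy and catches nothing. That is why a single percentage is hard to judge without knowing which metric it is and what the false-alarm rate was.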
Is the current cancellation rate 10%?
Meanwhile, Segun Famisa – a developer from Truppr – wrote the actual best performing code, but it was disqualified due to technical issues.
Can @mark give more information on the issues that caused the disqualification?
Disclosure: Segun Famisa is my friend, but I have not had any convo with him regarding this.
His code was learning anew each time. When we took it into production, instead of applying what it had learnt from the sample dataset to the new dataset, it would relearn from the new dataset, so obviously it would produce great predictions.
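The problem described here can be illustrated with a toy example. Nothing in the thread shows the disqualified code, so the `MemorizingClassifier` below and its data are entirely hypothetical; the sketch only shows why a model that is refit on the very data it is scored on looks far better than one that is fit once on the sample data and then applied to new data.

```python
class MemorizingClassifier:
    """Hypothetical toy model: remembers the label seen for each exact row."""

    def __init__(self):
        self.table = {}
        self.default = 0

    def fit(self, rows, labels):
        # Memorize every (row -> label) pair from the training data.
        self.table = dict(zip(rows, labels))
        # Fall back to the most common training label for unseen rows.
        self.default = max(set(labels), key=labels.count)
        return self

    def predict(self, rows):
        return [self.table.get(row, self.default) for row in rows]


def accuracy(predicted, actual):
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up bookings: (booking id, city); label 1 = cancelled, 0 = honoured.
sample_rows = [(1, "lagos"), (2, "abuja"), (3, "lagos"), (4, "kano")]
sample_labels = [0, 0, 1, 0]
new_rows = [(5, "lagos"), (6, "kano"), (7, "abuja"), (8, "lagos")]
new_labels = [1, 0, 1, 0]

# Intended use: learn from the sample data, then predict on unseen data.
honest = MemorizingClassifier().fit(sample_rows, sample_labels)
honest_acc = accuracy(honest.predict(new_rows), new_labels)

# Leaky use: relearn from the new data (labels included), then "predict" it.
leaky = MemorizingClassifier().fit(new_rows, new_labels)
leaky_acc = accuracy(leaky.predict(new_rows), new_labels)

print(f"fit on sample, scored on new data: {honest_acc:.2f}")
print(f"refit on new data, scored on it:   {leaky_acc:.2f}")
```

The second score is perfect only because the model has already seen the answers, which says nothing about how it would do on bookings whose outcome is not yet known.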