It seems to me that the testing cluster does not include enough correctness checks. As a result, a user could cheat and dramatically improve the performance of this application at the expense of producing incorrect results in some corner cases. For example, in work_hard, one could simply pair the cheapest inbound flight with the cheapest outbound flight, without checking whether another combination would give a lower total cost once discounts are applied.
I have included a simple test case in this post to illustrate this.
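
To make the failure mode concrete, here is a minimal sketch in Python. The discount rule (20% off when both flights are with the same airline), the prices, and the function names are all assumptions made up for illustration; they are not the application's actual pricing logic or data.

```python
from itertools import product

# Hypothetical flights as (airline, price) pairs; not taken from the real data set.
INBOUND = [("A", 100), ("B", 110)]
OUTBOUND = [("B", 100), ("A", 120)]

# Assumed rule: 20% off the total when both legs are with the same airline.
SAME_AIRLINE_DISCOUNT_PERCENT = 20


def total_cost(inbound, outbound):
    """Price of a round trip under the assumed same-airline discount."""
    cost = inbound[1] + outbound[1]
    if inbound[0] == outbound[0]:
        # Integer arithmetic keeps the comparison exact.
        cost = cost * (100 - SAME_AIRLINE_DISCOUNT_PERCENT) // 100
    return cost


def greedy_cost(inbound_flights, outbound_flights):
    """The shortcut: pair the cheapest inbound with the cheapest outbound."""
    cheapest_in = min(inbound_flights, key=lambda f: f[1])
    cheapest_out = min(outbound_flights, key=lambda f: f[1])
    return total_cost(cheapest_in, cheapest_out)


def exhaustive_cost(inbound_flights, outbound_flights):
    """The correct answer: evaluate every inbound/outbound combination."""
    return min(
        total_cost(i, o) for i, o in product(inbound_flights, outbound_flights)
    )


if __name__ == "__main__":
    print("greedy:    ", greedy_cost(INBOUND, OUTBOUND))
    print("exhaustive:", exhaustive_cost(INBOUND, OUTBOUND))
```

With these made-up numbers the shortcut pairs A (100) with B (100) for a total of 200, while the exhaustive search finds B (110) with B (100), which costs 168 after the discount. A correctness check on an input of this shape would catch the shortcut.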
Regards,
Cristian.



