P1:M1 Masyu Puzzle Master Problem - Judging and Scoring Criteria & Methods for Selecting Our Winners

Jeff Kataoka (Intel)
Threading Challenge 2011 - Masyu Puzzle, Master Problem 1: Judging & Scoring Criteria and Methods. On June 21, 2011, we announced the winners for Master Problem 1, the Masyu Puzzle. Our group of judges used the following judging and scoring criteria and methods to select the winners. You will also find a link to the testing results and the scores further below.

Master Level Problem Set 1 (P1:M1) Masyu Puzzle Key Scoring Principles

The basic scoring principles used for judging the contest entries are described on the official rules page. Here is a short summary. Each contest entry was scored according to the following criteria: 1) up to 100 points for the solution's performance (speed); 2) up to 50 points for the solution's write-up, using a 10-30-10 breakdown of points for the serial algorithm description, parallel algorithm description, and performance analysis, respectively; 3) a maximum of 25 bonus points for a contestant's activity in the forum, calculated as 5 bonus points for each valid forum post/reply.

Input Data Sets Used for Performance Scoring

Ten different input data sets were used to compute the execution score for this problem. The simplest is the 7x7 Masyu puzzle given as an example in the problem set description. The hardest is a dense 40x36 puzzle. The full archive of input data sets can be downloaded here.

Points in Performance Scoring

Each input data set was weighted equally and judged individually based on a ranking scheme. The overall performance score was calculated as the sum of the individual points from all ten input data sets.

We allowed a maximum of 120 seconds (2 minutes) of execution time for each input set; runs that took longer than 120 seconds or had runtime errors during execution were awarded zero performance points. Entries that could not be built on the MTL got zero points as well.

Successful contest entries that solved a test puzzle in less than 2 minutes were ranked by execution time and awarded performance points on a reciprocal rank-weight scale. For example, the fastest solution[s] of a data set got 10 points, the next got 5 points, the next 3.33 points, and so on.
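This reciprocal rank-weight scheme can be sketched in a few lines. The sketch below is my reading of the description, not official contest code: I assume the k-th fastest entry on a data set earns 10/k points, tied entries share the better rank (competition ranking), and failed or over-limit runs earn zero.

```python
def performance_points(times):
    """Reciprocal rank-weight scoring for one input data set (a sketch).

    times: {entry_name: execution time in seconds, or None for a failed run}
    Returns {entry_name: points}. The k-th fastest entry earns 10/k points;
    tied entries share the better rank, and entries that failed or hit the
    120-second limit earn 0 points.
    """
    TIME_LIMIT = 120.0
    valid = sorted(t for t in times.values() if t is not None and t < TIME_LIMIT)
    points = {}
    for entry, t in times.items():
        if t is None or t >= TIME_LIMIT:
            points[entry] = 0.0
        else:
            # index of the first occurrence -> tied times share one rank
            rank = valid.index(t) + 1
            points[entry] = 10.0 / rank
    return points
```

Under this reading, three entries tied at the fastest time would all earn 10 points, and the next distinct time would rank fourth and earn 10/4 = 2.5 points.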

Execution Results and Point Spread

We received 16 contest entries in the Master Level Masyu Puzzle problem set.

Only one entry successfully solved all 10 puzzles. Another entry solved 9 out of 10. One entry solved 6 puzzles and one solved 5. Four entries solved 3 puzzles each, and three entries solved 2. Finally, three entries were incomplete and therefore unable to solve any puzzle.

All the timings and performance points are available in the final Masyu scoring table below.

Write-ups Scoring

The write-up portion of each entry was read and scored by several judges. Each judge used the 10-30-10 breakdown of points for the serial algorithm description, parallel algorithm description, and performance analysis, respectively. An important component of the judging was to determine how close the submission was to being ready for publication on ISN. The assigned scores are available in the final Masyu scoring table below.

Forum Activity and Bonus Points

Bonus points were given for contestants' forum posts made before the problem entries were closed: five points per post, up to a maximum of 25 points.

Entry Points and Penalties

Each contest entry started with 100 entry points. A penalty of 50 points was deducted if the entry was unable to solve the simple Masyu puzzle given as an example in the problem set description.
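Putting the components above together, a contestant's final total might be computed as sketched below. This is my own illustration of the rules as stated (entry points, sanity-check penalty, performance, write-up, and capped forum bonus); the function and parameter names are mine, not the judges'.

```python
def final_score(performance_pts, writeup_pts, forum_posts, solved_example):
    """Combine the scoring components described above (a sketch).

    performance_pts: up to 100 points from the ten input data sets
    writeup_pts:     up to 50 points (10-30-10 breakdown)
    forum_posts:     number of valid forum posts/replies
    solved_example:  whether the simple example puzzle was solved
    """
    entry_pts = 100                      # every entry starts with 100 points
    penalty = 0 if solved_example else 50
    bonus = min(5 * forum_posts, 25)     # 5 points per post, capped at 25
    return entry_pts - penalty + performance_pts + writeup_pts + bonus
```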

Winners

The Masyu puzzle winners based on highest point total are the following:

1) john_e_lilley

2) Rui Diao

3) akki

These three contestants provided the solutions that solved the most of our test puzzles. They also had the fastest overall code execution and well-written write-ups.

Access Judging Score Card for Masyu Puzzle, Master Problem 1 Updated July 9, 2011

protocolocon

Where can we download input data sets?

akki

Congratulations John! That's an amazing result.

Thanks, judges. Looks like it was a lot of work. This is the most detailed scoreboard I've seen in the Intel Threading Challenge yet. I'm glad you've decided to disclose all the scores this year.

邓辉

Congratulations John!

In the office tower, office cubicles; in the cubicles, programmers. Programmers write programs, then trade the programs for drinking money. Sober, I just sit online; drunk, I sleep away from the net. Drunk or sober, day after day; online and offline, year after year.
Jeff Kataoka (Intel)

Sorry, the hyperlink to the data sets got lost when we copied the criteria text to the forum. That is now fixed and LIVE.

邓辉

I checked the test data.

02_do_easy_1010.txt
03_do_med_1018.txt
09_do_hard_2036.txt
10_do_crazy_4036.txt
Error: each line of the block contains only 8 integers.

06_pjbpuz03.txt
Error: only one block; it needs

B
0 0

Hope to re-test after the correction, thanks.

john_e_lilley
Quoting akki: Congratulations John! That's an amazing result.

Thanks akki! This was an incredibly challenging problem; I think I went through four or five total redesigns. My program has much more trouble with cases 06 and 08 than with case 10 because of the sparse nature of those boards, and it was somewhat lucky to choose the correct paths in those cases. That's where the extra cores in the MTL really helped: being able to explore multiple paths simultaneously, so that if 39 are dead ends, it might still find the answer on the 40th.

akki
Quoting Jeff Kataoka (Intel)
Threading Challenge 2011 - Masyu Puzzle, Master Problems 1: Judging & Scoring Criteria and Methods

...

Access Judging Score Card for Masyu Puzzle, Master Problem 1

I just noticed something. I knew something was out of place when the score card was first posted but wasn't sure what.

The 3rd sheet of the scorecard contains hidden columns that show the scores corresponding to individual times. These scores for the first problem are as follows:

Time (s)   Score
0.003      2.50
0.200      1.11
0.012      1.67
0.001      10.00
0.001      10.00
0.001      10.00
0.016      1.43
0.360      1.00
0.024      1.25
0.003      2.50
0.600      0.91
5.900      0.77
0.889      0.83
120.000    0.00
120.000    0.00
120.000    0.00

which means the penalty for having the second fastest solution is extremely severe. For example, a solution that is 0.002 seconds slower than the first costs 7.5 points, whereas one that is 5.897 seconds slower than the second costs only about 1.73 points!

Does this really make sense?!

Dmitry Oganezov (Intel)
Quoting akki ....the penalty for having the second fastest solution is extremely severe. For example, a solution that is 0.002 seconds slower than the first costs 7.5 points whereas one that is 5.897 seconds slower than the second costs just about 1.73 points!
Does this really make sense?!

Hi Akki!
This is a really good question.

First of all, those columns were hidden unintentionally. You can download the unhidden version of the score card here.

Back to your question about timing, points, and penalties. We used a reciprocal rank-weight scale, and the key word here is rank. In other words, we awarded points according to relative position in the ranking, not according to the seconds themselves. Why? We wanted to reward algorithms that demonstrate the best performance on various data sets, and at the same time we wanted to especially reward the top performers.

In our ranking system the top 3-5 performers get significant points - 10, 5, 3.33, or 2.5, depending on their results. Others get less than 2 points.

BTW: the results of the 1st test data set are special. The data set itself is tiny, and its only purpose is to make sure that the application passes a sanity check.

Jeff Kataoka (Intel)
Quoting 邓辉: I checked the test data.

02_do_easy_1010.txt
03_do_med_1018.txt
09_do_hard_2036.txt
10_do_crazy_4036.txt
Error: each line of the block contains only 8 integers.

06_pjbpuz03.txt
Error: only one block; it needs

B
0 0

Hope to re-test after the correction, thanks.

I'll have the judges look into this.

akki
Quoting Dmitry Oganezov (Intel)

Back to your question about timing, points, and penalties. We used a reciprocal rank-weight scale, and the key word here is rank. In other words, we awarded points according to relative position in the ranking, not according to the seconds themselves. Why? We wanted to reward algorithms that demonstrate the best performance on various data sets, and at the same time we wanted to especially reward the top performers.

In our ranking system the top 3-5 performers get significant points - 10, 5, 3.33, or 2.5, depending on their results. Others get less than 2 points.

Thanks for the clarification.

Say there are 3 participants, a, b and c

times for problem 1:
a: 0.001, 0.001, 0.001, 0.001
b: 0.002, 0.002, 0.002, 0.003
c: 0.003, 0.003, 0.003, 0.002

scores for problem 1:
a: 10 + 10 + 10 + 10 = 40
b: 5 + 5 + 5 + 3.33 = 18.33
c: 3.33 + 3.33 + 3.33 + 5 = 15

times for problem 2:
a: 0.500, 0.500, 0.500, 0.500
b: 0.001, 0.001, 0.001, 0.001
c: 6.000, 6.000, 6.000, 6.000

scores for problem 2:
a: 5 + 5 + 5 + 5 = 20
b: 10 + 10 + 10 + 10 = 40
c: 3.33 + 3.33 + 3.33 + 3.33 = 13.33

Totals:
a: 60
b: 58.33
c: 28.33

Clearly, the quality of b's algorithm for problem 2 is far superior to a's. But it is impossible for b to make a comeback, because b's first algorithm was marginally inferior. In this case, is the top performer across problems truly rewarded?

IMHO, a competition that evaluates performance should reward performance rather than penalize rank.
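The two schemes in this example can be compared numerically. The sketch below is my own illustration, not contest code: it scores the a/b/c times above under the contest's 10/rank scheme and under a hypothetical time-proportional scheme (10 × fastest/time). The rank scheme makes a the overall winner, while the proportional scheme lets b's problem-2 dominance prevail.

```python
# The a/b/c example times above: 8 data sets, four from "problem 1"
# and four from "problem 2".
TIMES = {
    "a": [0.001, 0.001, 0.001, 0.001, 0.500, 0.500, 0.500, 0.500],
    "b": [0.002, 0.002, 0.002, 0.003, 0.001, 0.001, 0.001, 0.001],
    "c": [0.003, 0.003, 0.003, 0.002, 6.000, 6.000, 6.000, 6.000],
}

def rank_total(entry):
    """Contest-style scoring: the k-th fastest on a data set earns 10/k."""
    total = 0.0
    for i, t in enumerate(TIMES[entry]):
        rank = sorted(ts[i] for ts in TIMES.values()).index(t) + 1
        total += 10.0 / rank
    return total

def proportional_total(entry):
    """Hypothetical alternative: points scale with relative speed."""
    total = 0.0
    for i, t in enumerate(TIMES[entry]):
        fastest = min(ts[i] for ts in TIMES.values())
        total += 10.0 * fastest / t
    return total
```

Under 10/rank the totals come to roughly a: 60, b: 58.3, c: 28.3, matching the example, while the proportional scheme gives b about 58.3 against a's 40.1.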

BTW: the results of the 1st test data set are special. The data set itself is tiny, and its only purpose is to make sure that the application passes a sanity check.

Why penalize a solution for marginally poorer performance in a sanity check, for which it probably isn't even optimized? A 7.5-point penalty (out of 10) for a difference of 2 ms, when solution times range from 1 ms to 5,900 ms, is huge!
