# P1:M1 Masyu Puzzle Master Problem - Judging and Scoring Criteria & Methods for Selecting Our Winners


Threading Challenge 2011 - Masyu Puzzle, Master Problem 1: Judging & Scoring Criteria and Methods

On June 21, 2011, we announced the winners for Master Problem 1, the Masyu Puzzle. Our group of judges used the following judging and scoring criteria and methods to select the winners. In addition, you will find a link to the testing results and the scores below.

Master Level Problem Set 1 (P1:M1) Masyu Puzzle

Key Scoring Principles

The basic scoring principles used to judge the contest entries are described on the official rules page. In short, each contest entry was scored according to the following criteria:

1) up to 100 points for solution performance (speed);

2) up to 50 points for the solution write-up, using a 10-30-10 breakdown of points for the serial algorithm description, parallel algorithm description, and performance analysis, respectively;

3) a maximum of 25 bonus points for a contestant's activity in the forum, calculated as 5 bonus points for each valid forum post/reply.

Input Data Sets Used for Performance Scoring

Ten different input data sets were used to compute the execution score for this problem. The simplest is the 7x7 Masyu puzzle given as an example in the problem set description. The hardest is a dense 40x36 puzzle. The full archive of input data sets can be downloaded here.

Points in Performance Scoring

Each input data set was weighted equally and judged individually using a ranking scheme. The overall performance score was calculated as the sum of the individual points across all ten input data sets.

We allowed a maximum of 120 seconds (2 minutes) of execution per input set; runs that took longer than 120 seconds or hit runtime errors during execution received zero performance points. Entries that could not be built on the MTL also got zero points.

Successful contest entries that solved a test puzzle in less than 2 minutes were ranked by execution time and awarded performance points on a reciprocal rank-weight scale. For example, the fastest solution[s] of a data set got 10 points, the next got 5 points, then 3.33 points, and so on.
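As an illustration, the reciprocal rank-weight scale might be sketched as below. This is a hypothetical reconstruction, not the judges' actual script; in particular, the tie handling (tied times share the top position but still consume subsequent ranking positions) is an assumption inferred from the published scorecard.

```python
# Hypothetical sketch of the reciprocal rank-weight scale: the entry at
# ranking position k earns 10/k points; tied times share the best position
# but still consume later positions; timeouts and failures score zero.
def rank_weight_scores(times, max_points=10.0, time_limit=120.0):
    order = sorted(t for t in times if t < time_limit)
    first_pos = {}
    for i, t in enumerate(order):
        first_pos.setdefault(t, i + 1)  # position of the first entry with this time
    return [round(max_points / first_pos[t], 2) if t < time_limit else 0.0
            for t in times]
```

Under this assumed scheme, with three entries tied at 0.001 s, the next-fastest time lands at position 4 and earns 2.50 points rather than 5.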

We received 16 contest entries in the Master Level Masyu Puzzle problem set.

Only one entry successfully solved all 10 puzzles. Another solved 9 out of 10. Two more entries solved 6 and 5 puzzles, respectively. Four entries solved 3 puzzles each, and three entries solved 2. Finally, three entries were incomplete and therefore unable to solve any puzzle.

All the timings and performance points are available in the final Masyu scoring table below.

Write-ups Scoring

The write-up portion of each entry was read and scored by several judges. Each judge used the 10-30-10 breakdown of points for the serial algorithm description, parallel algorithm description, and performance analysis, respectively. One important component of the judging was to determine how close the submission was to being publishable on ISN. The assigned scores are available in the final Masyu scoring table below.

Forum Activity and Bonus Points

Bonus points were given for contestants' forum posts made before the problem entries closed: five points per post, up to a maximum of 25 points.

Entry Points and Penalties

Each contest entry received 100 entry points. A penalty of 50 points was deducted if the entry could not solve the simple Masyu puzzle given as an example in the problem set description.

Winners

The Masyu puzzle winners based on highest point total are the following:

1) john_e_lilley

2) Rui Diao

3) akki

These three contestants provided the solutions that solved the most test puzzles. They also had the fastest overall code execution and well-written write-ups.

Access Judging Score Card for Masyu Puzzle, Master Problem 1 (Updated July 9, 2011)


Congratulations John! That's an amazing result.

Thanks, judges. It looks like it was a lot of work. This is the most detailed scoreboard I've seen in an Intel Threading Challenge yet. I'm glad you've decided to disclose all the scores this year.

Congratulations John!

Sorry, the hyperlink to the data sets got lost when we copied the criteria text to the forum. That is now fixed and LIVE.

I checked the test data:

02_do_easy_1010.txt
03_do_med_1018.txt
09_do_hard_2036.txt
10_do_crazy_4036.txt
Error: each line of the block contains only 8 integers.

06_pjbpuz03.txt
Error: only one block. It needs:
B
0 0

I hope to re-test after correction, thanks.

Quoting akki
Congratulations John! That's an amazing result.

Thanks akki! This was an incredibly challenging problem; I think I went through four or five total redesigns. My program has much more trouble with cases 06 and 08 than 10 because of the sparse nature of those boards, and it was somewhat lucky to choose the correct paths in those cases. That's one place where the extra cores in the MTL really helped: being able to explore multiple paths simultaneously, so that if 39 are dead ends it might still find the answer on the 40th.
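The speculative-parallelism idea described here (fan many candidate paths out to workers and stop at the first success) might be sketched as follows. This is a toy illustration, not John's actual code; `explore` is a hypothetical stand-in for a real depth-first Masyu search rooted at one candidate branch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def explore(branch):
    """Stand-in for a depth-first Masyu search rooted at one candidate branch."""
    return branch % 13 == 0  # pretend only a few branches lead to a solution

def solve_parallel(branches, workers=8):
    """Fan branches out to worker threads; return the first winning branch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(explore, b): b for b in branches}
        for fut in as_completed(futures):
            if fut.result():
                return futures[fut]  # most branches were dead ends; one succeeded
    return None
```

The payoff is exactly what the post describes: even if 39 of 40 branches are dead ends, one worker may still hit the solution, so wall-clock time tracks the luckiest branch rather than the sum of all of them.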

Quoting Jeff Kataoka (Intel)

Threading Challenge 2011 - Masyu Puzzle, Master Problems 1: Judging & Scoring Criteria and Methods...

Access Judging Score Card for Masyu Puzzle, Master Problem 1

I just noticed something. I knew something was out of place when the score card was first posted but wasn't sure what.

The 3rd sheet of the scorecard contains hidden columns that show the scores corresponding to individual times. These scores for the first problem are as follows:

| Time (s) | Score |
|---------:|------:|
| 0.003 | 2.50 |
| 0.200 | 1.11 |
| 0.012 | 1.67 |
| 0.001 | 10.00 |
| 0.001 | 10.00 |
| 0.001 | 10.00 |
| 0.016 | 1.43 |
| 0.360 | 1.00 |
| 0.024 | 1.25 |
| 0.003 | 2.50 |
| 0.600 | 0.91 |
| 5.900 | 0.77 |
| 0.889 | 0.83 |
| 120.000 | 0.00 |
| 120.000 | 0.00 |
| 120.000 | 0.00 |

which means the penalty for having the second-fastest solution is extremely severe. For example, a solution that is 0.002 seconds slower than the first loses 7.5 points, whereas one that is 5.897 seconds slower than the second loses only about 1.73 points!

Does this really make sense?!

Quoting akki
...the penalty for having the second-fastest solution is extremely severe. For example, a solution that is 0.002 seconds slower than the first loses 7.5 points, whereas one that is 5.897 seconds slower than the second loses only about 1.73 points!
Does this really make sense?!

Hi Akki!
That is a really good question.

First of all, ... of the score card here.

Back to the rank-weight scale: the key word here is rank. In other words, we awarded points according to relative position in the ranking, not the seconds themselves. Why? We wanted to reward algorithms that demonstrate the best performance across various data sets, and at the same time we wanted to especially reward the top performers.

In our ranking system, the top 3-5 performers get significant points (10, 5, 3.3, or 2.5, depending on their results); the others get less than 2 points.

BTW: the results of the 1st test data set are a special case. The data set itself is tiny, and its only purpose is to make sure the application passes a sanity check.

Quoting:

I checked the test data:

02_do_easy_1010.txt
03_do_med_1018.txt
09_do_hard_2036.txt
10_do_crazy_4036.txt
Error: each line of the block contains only 8 integers.

06_pjbpuz03.txt
Error: only one block. It needs:
B
0 0

I hope to re-test after correction, thanks.

I'll have the judges look into this.

Quoting Dmitry Oganezov (Intel)

Back to the rank-weight scale: the key word here is rank. In other words, we awarded points according to relative position in the ranking, not the seconds themselves. Why? We wanted to reward algorithms that demonstrate the best performance across various data sets, and at the same time we wanted to especially reward the top performers.

In our ranking system, the top 3-5 performers get significant points (10, 5, 3.3, or 2.5, depending on their results); the others get less than 2 points.

Thanks for the clarification.

Say there are 3 participants, a, b and c

times for problem 1:
a: 0.001, 0.001, 0.001, 0.001
b: 0.002, 0.002, 0.002, 0.003
c: 0.003, 0.003, 0.003, 0.002

scores for problem 1:
a: 10 + 10 + 10 + 10 = 40
b: 5 + 5 + 5 + 3.33 = 18.33
c: 3.33 + 3.33 + 3.33 + 5 = 15

times for problem 2:
a: 0.500, 0.500, 0.500, 0.500
b: 0.001, 0.001, 0.001, 0.001
c: 6.000, 6.000, 6.000, 6.000

scores for problem 2:
a: 5 + 5 + 5 + 5 = 20
b: 10 + 10 + 10 + 10 = 40
c: 3.33 + 3.33 + 3.33 + 3.33 = 13.33

Totals:
a: 60
b: 58.33
c: 28.33

Clearly, the quality of b's algorithm for problem 2 is far superior to a's. But it is impossible for b to make a comeback, because its first algorithm was marginally inferior. In this case, is the top performer across problems truly rewarded?

IMHO, a competition that evaluates performance should reward performance rather than penalize rank.
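The arithmetic in the example above can be checked with a small sketch, under the assumption that each data set simply awards 10/rank points (rank 1 being the fastest) with no ties; the participant names and times are the hypothetical ones from the example, not real contest data.

```python
def dataset_scores(times):
    """Award 10/rank points per data set; rank 1 is the fastest entrant."""
    order = sorted(times, key=times.get)
    return {name: round(10.0 / (order.index(name) + 1), 2) for name in times}

# The four data sets of each hypothetical problem from the example.
problem1 = [
    {"a": 0.001, "b": 0.002, "c": 0.003},
    {"a": 0.001, "b": 0.002, "c": 0.003},
    {"a": 0.001, "b": 0.002, "c": 0.003},
    {"a": 0.001, "b": 0.003, "c": 0.002},
]
problem2 = [{"a": 0.500, "b": 0.001, "c": 6.000}] * 4

totals = {"a": 0.0, "b": 0.0, "c": 0.0}
for dataset in problem1 + problem2:
    for name, pts in dataset_scores(dataset).items():
        totals[name] += pts
```

This reproduces the quoted totals of 60 for a and 58.33 for b (c comes out near 28.3, the small difference from the quoted 28.33 being rounding), so b indeed cannot overtake a despite a far faster problem-2 solution.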

BTW: the results of the 1st test data set are a special case. The data set itself is tiny, and its only purpose is to make sure the application passes a sanity check.

Why penalize a solution for marginally poor performance in a sanity check, for which it probably isn't even optimized? Losing 7.5 of 10 points for a difference of 2 ms, when solution times range from 1 ms to 5900 ms, is a huge penalty!