Sharing of the sources

Sharing of the sources

Hi to all,

it would be nice to, after contest, we share ours sources, to learn from each other.

Is anybody interested?

Regards, Dusan.

63 posts / 0 nouveau(x)
Dernière contribution
Reportez-vous à notre Notice d'optimisation pour plus d'informations sur les choix et l'optimisation des performances dans les produits logiciels Intel.

Yeah, definitely!!
It was hard work and it would be cool if other people can benefit from it :)

Rock the bits!

Yes. And it's also interesting to know results of testing of our solutions on the cluster tests and on some tests from forum.

Great Idea! Can we do it via the Forum?

You can share your times if you want to :)

You can also provide your code if you want but be sure to remove all the references of your md5.

Our Best time for CompileScale6 was 21.87s (CPU usage:1142%). What are your results?

Nice parallelization ;)
I couldn't quite get that, cause I only parallelized part of the graph building and then the play hard propositions which all had to wait for the sequential dijstra used to solve the work hard problem...

CompileScale6 5.41 (CPU usage:275%)
...
CompileScale5 1.00 (CPU usage:151%)
CompileScale4 1.63 (CPU usage:63%)
CompileScale3 0.54 (CPU usage:49%)
CompileScale2 0.44 (CPU usage:17%)

Rock the bits!

CompileScale6 1.86 (CPU usage:79%)
CompileScale5 1.27 (CPU usage:48%)
CompileScale4 1.27 (CPU usage:45%)
CompileScale3 0.66 (CPU usage:27%)
CompileScale2 0.49 (CPU usage:25%)

@Slevin7: My first idea was to implement dijstra, too. But i dropped it because of problems with the discount. Would be interesting to see your sources!
@All: Awesome runtimes so far!

CompileScale6 2.46
CompileScale5 1.35
CompileScale4 1.17
CompileScale3 0.67
CompileScale2 0.33
CompileScale1 0.06

Yeah, the discounts where tricky, they cost me a night of brain crunching to finally get the right idea. First I tried to change the algorithm rules and introduce conditions for already taken discounts, saved in the nodes, but that got rather confusing and I was running in circles. And then while trying to fall asleep I had an idea that reminds me on Occam's Razor -> let the Dijkstra do it's primitive work on a graph that implements the rules by it's structure. So I created a hyper (flight-)node for each discount type and only added the valid flights as connections. That increases the size of the graph, but enables the greedy Dijkstra to work right. I did the same thing to distinguishing between vacation before or after the conference. So 6 nodes per flight but not much extra work on searching. And since the nodes don't contain any condition chosen by the walker the graph is static therefore reusable and can be accessed by multiple threads at once.

Rock the bits!

We used dijkstra as well, but this test was hard for it (even when it solved all scenarios and all compilescales right). Does your algorithm solve it well? (script is the same as in scenario1 except Home and Conf cities):
1;Home;11012012050000;Conf;11012012173000;1100;F1
2;Home;11012012050000;Conf;11012012173000;1000;F2
5;Conf;11082012070000;AB;11082012143000;1;F1
6;AB;11082012170000;Home;11082012193000;1;F1

Also dijkstra may find solution with repeating cities (if you use flights as vertices), how do you deal with it?

Finally we wrote dynamic programming algorithm, our time on CompileScale6 is 5.05 s, but its seems that almost all time is taken by parsing. How do you parse so fast?

Well, parsing doesn't take that long, I map the file to ram, then I parse it (I thought about doing it in parallel but I had no time left): First of all (but you've probably done it) use a different technique to convert the time stamp (eg boost libraries, as I suggested in the Flight times thread). Then there are some shortcuts you can take, as soon as you parsed the take_off_time, check if the flight is valid at all (I have the feeling 90% are out of range anyways) then just advance 37 bytes (16 + C + ; + 16 + P + ; + A) and search for the end of the line. You can use strchr for that as long as you don't expect that the end is close.. I assumed I only have to bother for that if (end - it < 1000). There are faster methods to find stuff in strings (http://www.strchr.com/sse2_optimised_strlen?allcomments=1), but I didn't bother to implement them, due to lack of time. And I am copying only the stuff I really need to copy. For example, atof() parses floats and stops at the ";" so that you don't have to copy anything just give it the start position and it will do its thing. For the rest just skip as many bytes as you can, that worked reasonably good.

I will check your scenario, it looks pretty similar to the one that made my head hurt ^^

Rock the bits!

How much time does it take you to run the 10M flight test you provided? Also, how do you solve the cycles problem with dijkstra ?

Hmm... I couldn't run it, because I have only my 4yr old laptop with 4 GB RAM...
I don't have cycles. You can build a tree from the provided list of flights, because you can't go back in time, so to any flight you only add connecting flights that are after your current flight, in the allowed time frame of land_time + max_lay_over (respectively conference vacation times and stuff). There is no way to take the same flight twice. I checked for the visiting cities twice problem by having a vector (size of number of cities) with bool values that are true for those cities I already visited.

Rock the bits!

Hi everyone,

Even though we might not be in the race at all, here's our idea. We used dijkstra with the prioriry queue in our final solution. We lost most of the time (all but the last 3-4 days) adapting the problem to dijkstra adn ended up making a separate node for each incomimg flight resulting in num_of_nodes ~= num_of_flights. This made the number of nodes and the number of edges (flights) go up significatnly, but it still performed well. Since we had virtually no time for parllelization, we simply gave each thread a path (e.g. home->vacation->conference->gome) to calculate.
The last two days were a complete nightmare, since most of our submissions ended up in seg fault (due to writing permissions on the cluster we didn't have and didn't know we didn't have) or in Output errors (due to wrong number of spaces between solutions in specioal cases - no solutions). With the cluster updatting every now and then, you can imagine how difficult the debugging went when everything worked locally. I can't post any serious cluster results. since we never had any completely valid submissions, except maybe our last, where the report said the contest was closed, so we never got the results. We're hoping they'll accept it because we submitted it before the deadline. Our best time on newinput4 was around 80 seconds with 2 threads, 10-12 seconds of that being the input parsing. I hope this will enable you to comapre yourself with us if you wish. That's it from us.

Best regards,
Nenad

CompileScale6 1.80
CompileScale5 0.70
CompileScale4 0.79
CompileScale3 0.50
CompileScale2 0.32
CompileScale1 0.08

And, by the way, congratulations to everybody for excellent results :)

Best regards,
Nenad

yeah, I'm really amazed by the <2 sec results and would love to see the source. what algoritm/datastructure did you guys use? did you optimize for processor caches?

Rock the bits!

CompileScale6 3.33 (CPU usage:627%) - source in attachment

It would be nice to other attach their sources.

Also, congratulation to all who finished tasks.

Fichiers joints: 

Fichier attachéTaille
Télécharger my-source.zip28.34 Ko

You can't use vector for visited cities in dijkstra. Please attach source, I'll attach mine when I'm at home.

Citation :

Heye (aka. slevin7) a écrit :

yeah, I'm really amazed by the <2 sec results and would love to see the source. what algoritm/datastructure did you guys use? did you optimize for processor caches?

We use very fast reading with mmap and handmade parsing, handmade time converters to epoch and vice versa, linear Dijkstra algorithm which is called 4 times. Also we use Intel Parallel Amplifier and it helps us very much.

Wow :)
Could you attach your source as well Nikolay?

I attached mine here.

PS: This should be the right solution for your problem eviltosha:

“Work Hard” Proposition :
Price : 771.4
F1-1-Home (11/1 5h0min)/Conf (11/1 17h30min)-1100$-70%
F1-5-Conf (11/8 7h0min)/AB (11/8 14h30min)-1$-70%
F1-6-AB (11/8 17h0min)/Home (11/8 19h30min)-1$-70%

Fichiers joints: 

Fichier attachéTaille
Télécharger heye.zip156.78 Ko
Rock the bits!

Citation :

eviltosha a écrit :

You can't use vector for visited cities in dijkstra. Please attach source, I'll attach mine when I'm at home.

Why not?

Assume there are 5 cities: Home C1 C2 C3 Conference
Every city has an id corresponding to the entry in the vector (home has 0, Ci has i and Conference has 4)

I use an ordered container for all possible flights from my current spanning tree, each of these potential next flights contains a vector with 5 entries{0, 0, 0, 0, 0}

I start at home {1, 0, 0, 0, 0}
I take flight to C2 {1, 0, 1, 0, 0}
I try to go to home again, home has id 0 so I check if(vector[0]) -> can't take flight.

When I go to the conference or vacation I set the vector back to its initial state.
That's it ;)

Rock the bits!

Here are my results:

CompileScale6 1.19 (CPU usage:154%)
CompileScale5 1.88 (CPU usage:152%)
CompileScale4 1.00 (CPU usage:215%)
CompileScale3 0.37 (CPU usage:64%)
CompileScale2 0.62 (CPU usage:7%)
CompileScale1 0.05

The first idea we had was using Dijkstra but we simply forgot the idea along the way, which means that we didn't change the algorithm but improved it as much as we could.

I can share the sources but i don't think it would be very useful since the times aren't very good compared to yours.

What I can do is share the "README.txt" where you can find most of the ideas we used to improve the code. Cf attached file :)

Fichiers joints: 

Fichier attachéTaille
Télécharger readme.txt2.38 Ko

Citation :

Heye (aka. slevin7) a écrit :

Wow :)
Could you attach your source as well Nikolay?

Of course :)

Fichiers joints: 

Fichier attachéTaille
Télécharger cats.zip83.34 Ko

Hi there,

here are our results:

CompileScale6 1.01 (CPU usage:557%)
CompileScale5 0.85 (CPU usage:233%)
CompileScale4 0.88 (CPU usage:220%)
CompileScale3 0.94 (CPU usage:37%)
CompileScale2 1.43 (CPU usage:141%)
CompileScale1 0.08

Our code is available at GitHub (Commit messages are all in German though):
https://github.com/martin-helmich/intel-ayc-2012

Martin

Nikolay, your solution works wrong on this test. It finds path with repeating city in first segment.

Fichiers joints: 

Fichier attachéTaille
Télécharger scenario00.tar.gz751 octets

Citation :

eviltosha a écrit :

Nikolay, your solution works wrong on this test. It finds path with repeating city in first segment.

Maybe, but it's impossible case in the real world. It's easy to find hard test for every solution. In the real world two tickets cost together more than every of these separately do.

I asked this question to organizers several times and always answer was that we must check for non-repeating cities. Moreover, as follows from your README, you didn't write dijkstra because of probability of negative edges, which is pretty much the same.

Also I can create test, where every two tickets cost together more than every of these separately do, but because of layover time limit, your solution will be able to find optimal path with repeating cities. Actually, I've wrote that test on the forum at the beginning of contest.

Hi!

It would be a good idea to check correctness of the solution for such cases. I implemented Dijkstra too, but I dealt with cycles: whenever I detect a cycle in the answer, I remove each edge from the original graph and rerun Dijkstra (at the end, when no cycles are detected for each of these runs, I gather all the paths and select the cheapest one out of them).

Andrei

Citation :

Heye (aka. slevin7) a écrit :

Quote:

eviltosha wrote:

You can't use vector for visited cities in dijkstra. Please attach source, I'll attach mine when I'm at home.

Why not?

Assume there are 5 cities: Home C1 C2 C3 Conference
Every city has an id corresponding to the entry in the vector (home has 0, Ci has i and Conference has 4)

I use an ordered container for all possible flights from my current spanning tree, each of these potential next flights contains a vector with 5 entries{0, 0, 0, 0, 0}

I start at home {1, 0, 0, 0, 0}
I take flight to C2 {1, 0, 1, 0, 0}
I try to go to home again, home has id 0 so I check if(vector[0]) -> can't take flight.

When I go to the conference or vacation I set the vector back to its initial state.
That's it ;)

That is useless, you have no way of dealing with cycles like that in Dijkstra (and if you find another way, the complexity will raise). If you do what you say, you might lose some of the possible solutions.

Andrei

Citation :

Heye (aka. slevin7) a écrit :

Quote:

eviltosha wrote:

You can't use vector for visited cities in dijkstra. Please attach source, I'll attach mine when I'm at home.

Why not?

Assume there are 5 cities: Home C1 C2 C3 Conference
Every city has an id corresponding to the entry in the vector (home has 0, Ci has i and Conference has 4)

I use an ordered container for all possible flights from my current spanning tree, each of these potential next flights contains a vector with 5 entries{0, 0, 0, 0, 0}

I start at home {1, 0, 0, 0, 0}
I take flight to C2 {1, 0, 1, 0, 0}
I try to go to home again, home has id 0 so I check if(vector[0]) -> can't take flight.

When I go to the conference or vacation I set the vector back to its initial state.
That's it ;)

That is useless, you have no way of dealing with cycles like that in Dijkstra (and if you find another way, the complexity will raise). If you do what you say, you might lose some of the possible solutions.

Andrei

Citation :

Heye (aka. slevin7) a écrit :

Quote:

eviltosha wrote:

You can't use vector for visited cities in dijkstra. Please attach source, I'll attach mine when I'm at home.

Why not?

Assume there are 5 cities: Home C1 C2 C3 Conference
Every city has an id corresponding to the entry in the vector (home has 0, Ci has i and Conference has 4)

I use an ordered container for all possible flights from my current spanning tree, each of these potential next flights contains a vector with 5 entries{0, 0, 0, 0, 0}

I start at home {1, 0, 0, 0, 0}
I take flight to C2 {1, 0, 1, 0, 0}
I try to go to home again, home has id 0 so I check if(vector[0]) -> can't take flight.

When I go to the conference or vacation I set the vector back to its initial state.
That's it ;)

That is useless, you have no way of dealing with cycles like that in Dijkstra (and if you find another way, the complexity will raise). If you do what you say, you might lose some of the possible solutions.

Andrei

How do you define a cycle? I am building a directed acyclic graph, there is no possibility of a cycle.
Paths can merge, but that shouldn't be a problem. When you leave a node, there is no way to go back to it again.I cannot visit the same node (take the same flight) twice because I keep a set of visited nodes that I check before taking a flight.
I might have made a mistake but right now I can't see it, can you generate a scenario where my algorithm fails?

Rock the bits!

You build acyclic graph on flights not on cities. Probably I could construct such test, but for some reason I can't run your solution =(
It says "bash: ./run: No such file or directory", but that file exists and has all rights to be executed.

Did you try this?
$ cd release/
$ make
rm -f ./.depend
g++ -std=c++0x -O3 -Wall -o obj/main.o -c src/main.cpp
g++ -o run obj/main.o -ltbb -ltbbmalloc
$ cd scenario/
$ ./script.sh

work and play_hard.txt should be generated in the scenario folder...

Rock the bits!

I tried to use run file from archive, because I failed to compile your solution. Here's log from make:

$ make
g++ -std=c++0x -O3 -Wall -o obj/main.o -c src/main.cpp
src/main.cpp:629:46: error: ‘typename std::remove_reference< >::type&& std::move(_Tp&&) [with _Tp = std::vector, typename std::remove_reference< >::type = std::vector]’ is not ‘constexpr’
src/main.cpp:629:46: sorry, unimplemented: non-static data member initializers
src/main.cpp:629:46: error: in-class initialization of static data member ‘_f’ of non-literal type
src/main.cpp: In member function ‘void FlightSixpack::set(Flight*)’:
src/main.cpp:648:3: error: ‘_f’ was not declared in this scope
src/main.cpp: In member function ‘std::vector::const_iterator FlightSixpack::begin(bool) const’:
src/main.cpp:654:10: error: ‘_f’ was not declared in this scope
src/main.cpp: In member function ‘std::vector::const_iterator FlightSixpack::end(bool) const’:
src/main.cpp:662:11: error: ‘_f’ was not declared in this scope
src/main.cpp:666:11: error: ‘_f’ was not declared in this scope
src/main.cpp:668:2: warning: control reaches end of non-void function [-Wreturn-type]
src/main.cpp: In member function ‘std::vector::const_iterator FlightSixpack::begin(bool) const’:
src/main.cpp:655:2: warning: control reaches end of non-void function [-Wreturn-type]
make: *** [obj/main.o] Error 1

what compiler are you using?
I use gcc 4.7.2 and it compiles fine.
You can remove the move(), it's probably done by the compiler implicitly anyways.

vector _f = vector(6);

I don't know if that is a valid declaration... I'm not that experienced in c++... but it worked for me...
vector _f(6);
didn't work.

The less than and greater than signs are intrepreted as html...
"vector" is always "vector$lt$Flight*$gt$"

Rock the bits!

Here comes the test! Probably I've run it wrong (please say if you have different results), but on my computer your solution doesn't find any path, while path exists:

Price : 702.8
F1-3-Home (11/1 0h10min)/C2 (11/1 2h0min)-1000$-70%
F1-5-C2 (11/1 5h0min)/C3 (11/1 6h0min)-1$-70%
F1-8-C3 (11/1 6h30min)/C1 (11/1 7h0min)-1$-70%
F1-6-C1 (11/1 10h0min)/Conf (11/1 17h0min)-1$-70%
F1-7-Conf (11/8 7h0min)/Home (11/8 14h30min)-1$-70%

Fichiers joints: 

Fichier attachéTaille
Télécharger scenario00.tar.gz683 octets

Yes, this might mean doing that little "visited cities" thing in Dijkstra to get around cycles really loses some possible solutions.

Andrei

Yes, this is indeed a problem.... The most simple fix would be to not do the merging. But that would be essentially the same as the reference solution, because every possible path would be built into the graph...
Evil little scenario... :/

My avatar expresses my current feelings pretty good if you replace the text with ALGORITHM Y U NO WORK?

How did you get around this problem eviltosha?
You said you used dijkstra too, can you share your sources?

Rock the bits!

So, the special case is, that the condition for rejecting this path is only dependent on the previously taken path. So what I could do is, walk the way back, removing all flights taken until reaching the first branch.
BUT I would have to count the number of branches and decrease the counter every time I reject a path because of that situation... So keeping track of that is kind of ugly... Hmmm I feel it coming, another sleepless night... ^^

Rock the bits!

I think you can relax, contest has already ended. And now it's up to judges to rate our solutions. =)
I don't think there exists linear algorithm, which finds path without repeating cities. Maybe I'm wrong, so if you have one, write about it.

Yeah I know, but I can't stop thinking about it.... ^^
Hmm, I will try, this makes the whole thing ugly, but it might work.
How did you solve it? Or did you even solve it?

Rock the bits!

No we didn't. We just use 2 algorithms - firstly linear, and if it finds path with repeating cities, we recompute path for that city with exponential algorithm (which, of course, well-optimized).

Ahh yes, I remember... that was you who said that in another thread...

Rock the bits!

Hmm, does anyone have explanation for this ( didn't noticed it earlier):

Yesterday, when I finished a part of the parallization process, I got these results:
CompileScale6 3.36 (CPU usage:548%)
CompileScale5 0.88 (CPU usage:102%)
CompileScale4 18.75 (CPU usage:96%)
CompileScale3 7.76 (CPU usage:96%)
CompileScale2 6.14 (CPU usage:79%)
CompileScale1 2.17

, and this are results of non parallelized algorithm:
CompileScale6 16.73 (CPU usage:91%)
CompileScale5 1.50 (CPU usage:55%)
CompileScale4 1.25 (CPU usage:65%)
CompileScale3 0.72 (CPU usage:24%)
CompileScale2 0.58 (CPU usage:21%)
CompileScale1 0.05

You can clearly see acceleration on CompileScale6, but why are other so slow, or even slower than serial algorithm?
Just to mention, there is very little communication between threads, so even on my dualcore, if I run 12threads vs 1thread on the same scenario, the 12threads scenario is much faster than 1thread, but never slower (look at the CompileScale 2, 3, 4)?

Does it means that Intel benchmark cluster was slowed down by other users, who probably submited their solutions at the same interval when I was?

EDIT: Also, on the other singlecore machine there is few% of deacceleration between 12 and serialized execution, which guarantee me that there is no big overhead for assigning tasks to threads.

eviltosha...
here you go:

“Work Hard” Proposition :
Price : 702.8
F1-3-Home (11/1 0h10min)/C2 (11/1 2h0min)-1000$-70%
F1-5-C2 (11/1 5h0min)/C3 (11/1 6h0min)-1$-70%
F1-8-C3 (11/1 6h30min)/C1 (11/1 7h0min)-1$-70%
F1-6-C1 (11/1 10h0min)/Conf (11/1 17h0min)-1$-70%
F1-7-Conf (11/8 7h0min)/Home (11/8 14h30min)-1$-70%

The source file is from the development branch, so it's kind of messy... But it should work

Fichiers joints: 

Fichier attachéTaille
Télécharger main.cpp60.81 Ko
Rock the bits!

Pages

Laisser un commentaire

Veuillez ouvrir une session pour ajouter un commentaire. Pas encore membre ? Rejoignez-nous dès aujourd’hui