The cluster is momentarily down

Hi,

You may have noticed that the cluster has not been responding since this afternoon. It seems that our internet provider is currently having trouble handling requests.

We are following the problem and will probably have more information in the next few hours. Until then, stay in touch on the forum.

We are sincerely sorry for the inconvenience.

What is the expected time of benchmark recovery? Should we wait for it today?

Quote:

eviltosha wrote:

What is the expected time of benchmark recovery? Should we wait for it today?

+1

Yeah, I'm also waiting, because I just finished implementing a new algorithm and would like to see the results, not for speed but for correctness. (I know that a successful run doesn't prove the code correct, but at least it makes me more confident.) But you are probably waiting for the provider too...
Could you otherwise provide some more scenarios for testing?

Rock the bits!

I still have no information from the provider and I'm still hoping to have a response soon.

Once this problem is solved, I will try to add more scenarios.

I have temporarily fixed the problem, but it could happen again in the near future.

The tests now run at longer intervals (less often), so it may take from 15 up to 30-45 minutes for you to get your results back.

Don't forget to clean your project before sending it (run make clean and remove the 'run' file).

Hi!

I still have trouble with the benchmarking cluster - I submitted 3 times in almost 2 hours and nothing happened. What's more, the old results have been cleared out from the page. Any thoughts on this?

Thanks!
Andrei

Could you look at our submission? The timestamp is 1353533402. We can't understand what the problem is; we always get "CompileUnitTest ERROR - Output error". Locally it works normally.

Can we get a more detailed verdict?
How can we tell whether our program works correctly on the benchmark?

Hi Andrei,
What Cédric did was the best he could do. If the problem comes from the provider, it's not possible to determine exactly when the benchmark will be back.
It is very frustrating for us, but there is nothing we can do except wait.
Try to optimize what you can for now. You should have some work to do: we still have 3 weeks before the end of the contest, and if you have already done everything you could, you should not be far from finished and can afford to wait a few more days! :) You can create a copy of your program to try some more optimizations, or find something else to make these extra days productive!
Moreover, if possible, you can try to test your program on your own computer.

Cédric will certainly give us more news later! But for now, we can only wait and see! ;)

Be patient! :D
Regards,

Timothé Viot,
Engineer student, Insa Rennes 1
France

For the missing submissions, could you take a look at your administration page? They should be back now. I don't know what really happened, but some submissions were indeed missing.

It's probably due to the provider problem; we may have run into synchronization problems. I'm also using a very slow connection as a substitute for the provider's, so the problem may occur again. If anyone notices missing "old" submissions, I would appreciate being informed.

You will also notice that your last run is now available.

As for your output problem, it's almost nothing:
Your output numbers have insignificant zeros after the decimal point when the value is a whole number. If this hint is not enough, send me an email at cedric.andreolli@intel.com and I will send you your output and the expected one.

As I understand it...

Our program outputs 1234.00
Your program outputs 1234

Why is our answer incorrect? I would like to repeat my earlier question: how many digits after the decimal point should we output?

Please answer as precisely as possible, because different I/O functions in C++ use different policies for printing doubles.

The answer is that it depends. The program we gave you writes floats to std::cout. The default behavior prints up to 6 significant digits. As always, your output must match ours exactly.
We evaluate solutions with a "diff" on the cluster. This is why we insist that programs output exactly the same thing as our program does.
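To illustrate the formatting difference discussed above, here is a minimal C++ sketch. The helper names `format_default` and `format_fixed` are hypothetical and are not part of the contest code; the sketch only assumes that the reference program writes values to a stream without changing its formatting flags.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a value the way an unmodified stream does
// (general notation, precision 6), i.e. what `std::cout << value` prints
// when no formatting flags have been changed.
std::string format_default(double value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Hypothetical helper: formats with a fixed number of decimals.
// Trailing zeros are kept, which is what a line-by-line diff would flag.
std::string format_fixed(double value, int decimals) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}
```

With these helpers, `format_default(1234.0)` yields `1234`, while `format_fixed(1234.0, 2)` yields `1234.00`, which matches the mismatch reported in this thread. Leaving the stream's default formatting untouched is therefore the simplest way to match a reference that does the same.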

Hi,

Is the problem solved?

Thanks

Axel Shaïta, U.F.R Sciences Reims

Yes, still only temporarily, but it is solved :)

The results are just refreshed less often (every 30 minutes to one hour).

OK, because I uploaded my solution this morning and still have no result. I'll upload again ;)

Thanks!

Axel Shaïta, U.F.R Sciences Reims

@Axel: It looks OK, doesn't it? I have a test run from an Axel :)

Yes, it's ok! Thanks!

Axel Shaïta, U.F.R Sciences Reims

The benchmark doesn't work for me. The last result is from almost 7 hours ago, but I sent my program 3 hours ago, 2 hours ago, and several minutes ago. Nothing new has appeared.

I have a result on the cluster that you can't see for the moment. You will probably just have to wait a little. The test that ran was the same as the last one; you passed it, and the time is not as good as the previous one (but still close) :)

And the front-end server problems are back!

It seems that the provider is once again having trouble, so the cluster seems to have been unavailable for the last 300 minutes.

It may be just a momentary problem. We are doing our best to solve it.

We were just wondering why our new submissions are not displayed. Do you know when the cluster will be available again? You mentioned releasing new tests on the cluster tomorrow. Is this still the ETA?

Hi! How is the server doing now?

Quote:

Pierre B. wrote:

Hi! How is the server doing now?

I think it's down :(

Hi there,

Can you say anything about the benchmarking cluster? It seems to have been off for about 20 hours... When will it be available again?

I still have trouble submitting anything to the benchmarking cluster.

Quote:

Cristian K. wrote:

I still have trouble submitting anything to the benchmarking cluster.

Unfortunately, me too. I hope it will be fine tonight! GL Intel teams ;)

I still have no idea when the servers will be back. This is a really big problem right now, but it's completely out of our hands.

We just have to wait for now because the problem comes from the provider.

Would it be possible to set up another cluster?
I feel like I'm working blind right now, because I sometimes get output errors on the cluster when all my tests at home run fine.
I think we don't need the runtime right now, because you can measure your speed improvements on your home computer, but a small setup just for correctness checks would be nice.

Rock the bits!

I thought the cluster was for testing scalability, not for correctness checks. Maybe it would be a good idea to release new, even bigger input files with a reference solution. Then we could check correctness and scalability at home. At our university we have a compute server for that purpose. Maybe students from other universities have similar options.

Quote:

Hannes T. wrote:

I thought the cluster was for testing scalability, not for correctness checks. Maybe it would be a good idea to release new, even bigger input files with a reference solution. Then we could check correctness and scalability at home. At our university we have a compute server for that purpose. Maybe students from other universities have similar options.

We don't have that chance; I pray for the Intel cluster.

Given the situation, would it be possible to extend the contest time?

I just got some information. The problem is due to the amount of data transferred between the cluster and the front-end server.

They will re-establish the connection soon, but to avoid this problem before the end of the contest, we will slow down the refreshes between the cluster and the front-end server. This means that you will get result updates only every 30 minutes (instead of every 5 minutes before).

I can't give you the exact time when the server will be back, but it should be in the near future.

I am very sorry for the trouble you have gone through, and I will do my best to help you solve your problems.

So if you are running into "Output error" problems, don't know why, and are afraid you won't be ready in time, send me an email at cedric.andreolli@intel.com. I'm often unable to answer during the day (8am to 7pm French time), but I'm available after that.

Good news so far. What about bigger test cases on the cluster and/or for download?

I will see tomorrow whether the server is up and whether there are any problems.

That is not the priority as long as the front end is not back :)

Quote:

Cédric ANDREOLLI (Intel) wrote:

I will see tomorrow whether the server is up and whether there are any problems.

That is not the priority as long as the front end is not back :)

Will the contest be extended?

Hello,

If I understood correctly, the cluster should be usable by now, but we aren't getting any results. Is the cluster available?
This is quite crucial for us now because, until recently, we had a lot of trouble with the correctness of our serial algorithm, so these will be our first tests...

Best regards,
Nenad

The front end server (OVH) seems to be up right now after almost 2 days...

Such good news! Is the refresh time 30 min until Friday?

Hi there,

Nice to hear!
We committed several versions while the cluster was down. Which one was compiled just now? Only one has been compiled since the cluster went off. Is it only the last one?

@Pierre : Yes
@marcel : The last one

It seems to me that the cluster is down again. I submitted a program one hour ago and there is no result yet. The last result from the cluster was even worse than the result two days ago, but the program should have run faster. Is there heavy load on the cluster nodes?

Quote:

Hannes T. wrote:

It seems to me that the cluster is down again. I submitted a program one hour ago and there is no result yet. The last result from the cluster was even worse than the result two days ago, but the program should have run faster. Is there heavy load on the cluster nodes?

We have exactly the same problem. The last submission was worse than 2 days ago, but the program was the same. And now we do not have results for what we submitted today.

You will probably have to wait a little, because there is a large number of files to run right now.

How can we get a response from the benchmark? One day remains until the end and we can't check compilation, correctness of output, etc. It's sad...

I know, and I am very sorry about it, but OVH bears a large share of the responsibility for this situation.

A lot of candidates are trying to submit right now and the cluster queue is quite overloaded.

You should know that each time you submit a new zip file, you go to the end of the queue. So the best way to get access to the benchmark is to submit once and wait for the result. Right now, 2 or 3 hours should be the longest you have to wait for your results.

Some of you asked about the possibility of adding a few days to the deadline. The problem is that I am not allowed to make this decision.

I submitted a solution about 7 hours ago, with no response. Before that submission, more than 4 hours earlier, I submitted another one, with no response either. The last update was 950 minutes (almost 16 hours) ago.
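The queueing rule described above can be pictured with a small C++ sketch. This is a hypothetical model, not the cluster's actual scheduler: the class `SubmissionQueue` and its methods are invented for illustration, and it assumes that a resubmission replaces the team's pending entry.

```cpp
#include <algorithm>
#include <deque>
#include <string>

// Hypothetical model of the benchmark queue; not the cluster's real code.
class SubmissionQueue {
public:
    // Assumption: a new submission replaces the team's pending one
    // and is placed at the back of the queue.
    void submit(const std::string& team) {
        pending_.erase(std::remove(pending_.begin(), pending_.end(), team),
                       pending_.end());
        pending_.push_back(team);
    }

    // Number of submissions ahead of `team`, or -1 if it has none pending.
    int position(const std::string& team) const {
        auto it = std::find(pending_.begin(), pending_.end(), team);
        if (it == pending_.end()) return -1;
        return static_cast<int>(it - pending_.begin());
    }

private:
    std::deque<std::string> pending_;
};
```

In this model, a team at the front of the queue that submits again drops behind every other waiting team, which is why resubmitting repeatedly tends to lengthen the wait rather than shorten it.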

More than 12 hours - no response.
May we duplicate our final submission by e-mail to be sure you will receive it?

I increased the number of tests that can run at the same time on the cluster. This means that you will no longer have the full machine to yourself, but it should be enough to reduce the time you are waiting.

Can you give us a hint how long the average wait time is?

Rock the bits!

I think you should have it right now.

The procedure should really be faster from now on.

From my point of view it isn't. I submitted half an hour ago and there is still no result. We also got yesterday's results only today.
