Recent posts
https://software.intel.com/en-us/recent/515424
Post-mortem
https://software.intel.com/en-us/forums/p1-a2-consecutive-primes/topic/283077
<p>Problem 2 was a fun one (the one I liked best, to be honest). These are the optimisations I used:</p>
<p>1) Prime generation</p>
<p>Generating primes is fun and can be done quickly, but there's one thing even better: not having to calculate them! However, a table of all primes < 2^32 turned out to be 800+ megs, which is too massive; the load time would have eaten me alive. Plus a binary search to find the proper start and end indexes wouldn't have been too snappy either.</p>
<p>But what if there's a way to cut that massive table down in size and pretty much give us a direct index to the numbers we want? Well, all primes > 5 can be expressed as 30k+1, 30k+7, 30k+11, 30k+13, 30k+17, 30k+19, 30k+23 or 30k+29. That is eight residues, which is exactly 8 bits, so one byte covers 30 numbers! It cuts the table down to a nicer 136 megs and it'll give you instant primes!</p>
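<p>A minimal Python sketch of that byte-per-30-numbers layout, assuming a plain sieve is used to fill the table. The names <code>build_table</code> and <code>is_prime</code> are made up for illustration and are not from the original code. For the full range the table needs 2^32 / 30 bytes, roughly 143 million bytes or about 136 MiB, which matches the figure above.</p>

```python
import math

# The eight residues mod 30 that a prime > 5 can have: one bit each.
RESIDUES = (1, 7, 11, 13, 17, 19, 23, 29)
BIT_OF = {r: i for i, r in enumerate(RESIDUES)}


def build_table(limit):
    """Bit BIT_OF[n % 30] of byte n // 30 is set iff n is a prime >= 7."""
    # Ordinary sieve of Eratosthenes, used here only to fill the packed table.
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    table = bytearray(limit // 30 + 1)
    for n in range(7, limit + 1):
        if sieve[n]:
            # Every prime >= 7 has one of the eight residues above.
            table[n // 30] |= 1 << BIT_OF[n % 30]
    return table


def is_prime(table, n):
    """Direct lookup: no search, just a byte index and a bit test."""
    if n in (2, 3, 5):
        return True
    bit = BIT_OF.get(n % 30)
    return bit is not None and n >= 7 and bool(table[n // 30] >> bit & 1)
```

<p>Usage: <code>table = build_table(1000)</code>, then <code>is_prime(table, 997)</code>. The direct index is the point: the byte for any number n is simply n // 30, so finding the start and end of a range needs no binary search.</p>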
<p>2) Determining if it's a power or not</p>
<p>2a) Do we even need to bother checking a number? </p>
<p>Numbers that are a power of something have a funny property in binary: if you look at the lowest nibble, it will *NEVER* be 2, 6, 10, 12 or 14. That quickly gets rid of about 30% of numbers with a single 'and' instruction and a few compares.</p>
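<p>The reason this works: an even base 2m squared is 4m^2, cubed is 8m^3, and any higher power is a multiple of 16, so an even perfect power can only end in nibble 0, 4 or 8. A small Python sketch of the filter, with a brute-force check over small bases and exponents (the function name is made up for illustration):</p>

```python
# Low nibbles that no perfect power b**e (e >= 2) can have.
IMPOSSIBLE_NIBBLES = frozenset({2, 6, 10, 12, 14})


def may_be_perfect_power(n):
    """Cheap pre-filter: False means n is certainly not a perfect power."""
    return (n & 0xF) not in IMPOSSIBLE_NIBBLES


# Brute-force check: collect the low nibble of every small perfect power.
seen_nibbles = {(b ** e) & 0xF for b in range(2, 200) for e in range(2, 10)}
```

<p>5 of the 16 possible nibbles are ruled out, which is where the "about 30%" comes from (5/16 is 31.25%).</p>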
<p>2b) The numbers that are left after that:</p>
<p>The sum of all primes in play is 425649736193687430, whose square root is 652418375.119591, meaning the highest number we would ever see as a base is 652418375. So I figured I'd loop through all numbers [2..652418375] ^ [2..100], see if the result is below 425649736193687431, and store the base+power+value (only the lowest power+base for numbers that are multiple powers) in a lookup table (I bet by now you are all going 'damn, this guy *really* likes his lookup tables'). </p>
<p>Well, it turns out that table gets *BIG* really quickly, but it is quite manageable for powers > 2. </p>
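<p>A rough Python sketch of such a table for powers > 2, scaled down to a small limit. It assumes "lowest power+base" means keeping the representation with the lowest base; that reading, and the name <code>build_power_table</code>, are assumptions and not taken from the original code.</p>

```python
def build_power_table(limit):
    """Map each value b**e <= limit with e >= 3 to its (lowest base, exponent)."""
    table = {}
    base = 2
    while base ** 3 <= limit:
        value, exp = base ** 3, 3
        while value <= limit:
            # Bases are visited in ascending order, so the first entry
            # stored for a value is the one with the lowest base.
            if value not in table:
                table[value] = (base, exp)
            value *= base
            exp += 1
        base += 1
    return table
```

<p>Usage: <code>build_power_table(2**28)[268435456]</code> gives <code>(2, 28)</code>. Only bases up to the cube root of the limit are visited, which is why this table stays small compared with one that also holds squares.</p>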
<p>But what about the squares (power 2) then? </p>
<p>Well, squares have the interesting property that the lower nibble is always 0, 1, 4 or 9, so if it is one of those, run a good old sqrt and see if the number is a square or not (I tried a lookup table here too, but sqrt turned out to be faster).</p>
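<p>A small Python sketch of that two-step square test. Note that it swaps the floating-point sqrt for <code>math.isqrt</code>, which is exact for large integers; the function name is made up for illustration.</p>

```python
import math

# A square can only end in one of these low nibbles.
SQUARE_NIBBLES = frozenset({0, 1, 4, 9})


def is_square(n):
    """Nibble pre-check first, exact integer square root only if it passes."""
    if (n & 0xF) not in SQUARE_NIBBLES:
        return False
    root = math.isqrt(n)
    return root * root == n
```

<p>The nibble check throws out 12 of every 16 candidates before the square root is ever computed.</p>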
<p>Further improvements for powers > 2</p>
<p>Initially I had them all in a big sorted list which I did a binary search on. That worked well but, due to the size of the table, was not the best performer. So to speed it up I turned it into a really basic hash table using bits 21-43 of the number as a hash, which gave me fewer than 10 numbers in most buckets, which is stupidly fast to search through.</p>
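<p>A minimal Python sketch of that bucketing scheme. It assumes "bits 21-43" is inclusive (23 bits, so up to 2^23 buckets) and uses a dictionary of lists instead of a flat array to keep the example small; all names are made up for illustration.</p>

```python
from collections import defaultdict

HASH_SHIFT = 21                    # lowest bit used for the hash
HASH_BITS = 23                     # bits 21..43 inclusive (assumed reading)
HASH_MASK = (1 << HASH_BITS) - 1


def bucket_of(n):
    """Take bits 21..43 of n as the bucket number."""
    return (n >> HASH_SHIFT) & HASH_MASK


def build_buckets(entries):
    """entries: iterable of (value, base, exponent) tuples."""
    buckets = defaultdict(list)
    for value, base, exp in entries:
        buckets[bucket_of(value)].append((value, base, exp))
    return buckets


def lookup(buckets, n):
    """Linear scan of one short bucket; returns (base, exponent) or None."""
    for value, base, exp in buckets.get(bucket_of(n), ()):
        if value == n:
            return base, exp
    return None
```

<p>Usage: <code>lookup(build_buckets([(2**25, 2, 25)]), 2**25)</code> gives <code>(2, 25)</code>. With only a handful of entries per bucket, the scan touches far less memory than a binary search over the whole sorted list.</p>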
<p>3) Threading</p>
<p>Just a parallel loop through the primes adding them up. Not much to it really; this was by far the easiest of the 3 problems to thread.</p>
<p>Most of my time on this problem was spent trying to figure out why the 40 core Windows MTL box *REFUSED* to use all of the cores with either OpenMP or TBB: you always ended up on a random processor group (either cores 0-9 or 10-39) but never on all of them. It turns out that in the Intel v11 compiler, which was on the box, OpenMP was not aware of processor groups (new in Win7/2008R2), and TBB (which was aware) had a subtle bug in the code that assigned threads to cores. I found the bug, made a quick workaround (details are somewhere in a thread in the TBB forum) and figured my solution would definitely have an edge over other ones that would end up not using all cores... and then Intel moved us all off the box because it 'had issues' (I said it before, I'll say it again: BOO!) </p>
<p>In the end I ran out of time and the code ended up being a bit (and by a bit, I mean A LOT) messy but functional.</p>
<p>Warning: due to the *MASSIVE* lookup tables, the code is a whopping 112 megs compressed.</p>
Tue, 28 Jun 11 19:08:47 -0700 lazydodo 283077
Has anyone tried 574395734 cycles?
https://software.intel.com/en-us/forums/p1-a3-running-numbers/topic/283369
<p>I was frustrated that my code could not reproduce the 4774 cycles until I found out on this forum that adding cycle 0 makes it work. I agree with the rest of you that if cycle 0 is required then the walkthrough is very misleading.</p>
<p>But I'm not comfortable relying on one test case only; I want to verify the other test case as well. Has anyone tried the 574395734 cycles test case? I want to confirm that "cycle 0" is necessary before I proceed.</p>
Sun, 12 Jun 11 13:05:48 -0700 kayson 283369
More Examples
https://software.intel.com/en-us/forums/p1-a3-running-numbers/topic/283460
<p>Since the Example thread in the previous two stages was always a hit, let's have one for this one.</p>
<p>Since I'm not entirely sure about my implementation, I'll post a few easy-to-verify ones and we'll work our way from there to the more interesting stuff. As always, huge disclaimer: 'I validated these using the same code that solved the Intel examples correctly. I cannot, however, rule out any mistakes; if you are getting different results, it's probably me screwing up, not you.'</p>
<p>The format is the same as the intel examples with one extra column</p>
<p>[Source] [ByteAdd] [DwordAdd] [SolveCycles]</p>
<p>8AF59446C631DDCF10B5493FCBA37056 E76117A796CA78991315228184795EA8 39F890F96E633900FDCCB8F8DCF9D096 7196 <br />255827EB68F924D5734A6FB8A5357EA6 0CD68CE5D0E3445B97009E188FACC76A 0005E57CB24F5304B9371AC60ED533DA 21593 <br />80BE9ECCEF2D2E1654B5F11D25F77F06 5033D9D66621C0FFE36D1FCFBACB9C2B A41B72AC586B0CE69AB35EF9EBB51876 7179 <br />1DC95205DF9604403434AA2E4DAE8191 1102BDA8D2181BF23C3849F48CB0831C DD2CAEF30CA06E346FE4DF4AE0130B4F 5039 <br />EE430DEE716BDB4120542945B3E2B021 FFC26872229809122B4FDA94CEBAB0AE 915FA2CE123D51A723696017558562BF 11419 </p>
Tue, 07 Jun 11 18:34:57 -0700 lazydodo 283460
Algorithm use
https://software.intel.com/en-us/forums/p1-a2-consecutive-primes/topic/283854
<p>Hi there,</p>
<p>Just a thought regarding the "original creation" phrase in the rules:<br />
how would it be viewed if I said, for example, "I implemented algorithm<br />
ABC by John Smith to solve this problem"? Assuming that the algorithm<br />
is in the public domain, of course, and available to be used by anyone (for non-commercial use at least),<br />
is that all OK?</p>
<p>This wasn't really much of an issue for the Maze of Life, but there's<br />
plenty of commentary out there about calculation of prime numbers. It's<br />
hard to say whether implementing an advanced algorithm would be<br />
practical in a limited time, but well, it'd be interesting to know :)</p>
<p>Thanks!</p>
Wed, 11 May 11 15:54:46 -0700 yoink 283854
Sidecases, what to report?
https://software.intel.com/en-us/forums/p1-a2-consecutive-primes/topic/283902
<p>From the example case</p>
<p>sum(17:19) = 36 = 6**2<br />sum(5:13) = 36 = 6**2</p>
<p>I'm able to tell that if several sequences produce the same sum they all have to be reported, but how do we handle duplicates on the power side of things?</p>
<p>for instance:</p>
<p>sum(20063:32987) and sum(2097031:2097287) both produce 33554432 (first draft of my app, results could be wrong). No problem, just report both lines, yet 33554432 gives an interesting sidecase, being both 2^25 and 32^5.</p>
<p>Do we just pick one? If so, which one should we pick? Does it matter? Do we pick both? If so, how do we report?</p>
<p>like this</p>
<p>sum(20063:32987)=33554432=2**25=32**5<br />sum(2097031:2097287)=33554432=2**25=32**5</p>
<p>or like this:</p>
<p>sum(20063:32987)=33554432=2**25<br />sum(20063:32987)=33554432=32**5<br />sum(2097031:2097287)=33554432=2**25<br />sum(2097031:2097287)=33554432=32**5</p>
<p>Also note this problem is not limited to 2 duplicates, for instance sum(7063741:7064419) gives us 268435456 which I can make out of 2^28, 4^14, 16^7 and 128^4</p>
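<p>To see how many duplicates a given sum can have, here is a small Python sketch that lists every base**exponent form of a number. It is only an illustration of the sidecase, not part of any submitted solution, and the function names are made up.</p>

```python
def iroot(n, k):
    """Largest integer r with r**k <= n, by binary search (no floats)."""
    lo, hi = 0, 1 << (n.bit_length() // k + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def power_representations(n):
    """All (base, exponent) pairs with base >= 2, exponent >= 2, base**exponent == n."""
    reps = []
    # The exponent can be at most floor(log2(n)) = n.bit_length() - 1.
    for exp in range(2, n.bit_length()):
        root = iroot(n, exp)
        if root >= 2 and root ** exp == n:
            reps.append((root, exp))
    return reps
```

<p>For 33554432 this returns (32, 5) and (2, 25). For 268435456 it returns five pairs: the four named above and 16384**2.</p>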
<p>Please advise.</p>
Mon, 09 May 11 22:02:05 -0700 lazydodo 283902
What extra skills required?
https://software.intel.com/en-us/forums/2011-apprentice-entry-level-problems/topic/283963
<p>I am familiar with C/C++ coding and system programming, including threads (pthread.h) and semaphores.<br />I have done a decent amount of programming on SPOJ and other sites.</p>
<p>In which direction should I think in order to solve the problem?<br />Do I have to execute different instances of the recursion as individual threads? If yes, what is the maximum number of threads allowed?<br />I have seen some TBB code samples. They contain header files that are not in ANSI C, so do I need those header files in order to solve these problems?</p>
Fri, 06 May 11 07:02:10 -0700 adityaork 283963