With the Beta release of the Intel Distribution for Python 2017, I was tasked to step up in my new TCE position and offer some of the initial support for collateral on the Intel VTune-Amplifier feature to profile Python code. I’ve been dabbling with Python for a few years, due to my Health & Life Science affiliation, but haven’t exercised those programming muscles in a while.
Obviously, to try the new VTune-Amplifier Python features I needed a code. The bulk of my Python coding has been to solve Project Euler problems. These run pretty quickly and, of the 70 problems I’ve solved (at the time of this writing), I wasn’t sure which of these might run the longest to give the profiler something “meaty” to process. I wanted something that would have enough execution time to make the runs worthwhile and actually be able to generate some meaningful analysis. I remember at least one of my initial solutions ran overly long, but I couldn’t recall which one that was, so I chose one that dealt with prime numbers (a favorite topic of mine). The actual run to solve the specific problem ran in under 2 seconds, so I bumped up the search space and got a more prolonged execution time.
The analysis wasn’t anything that was unexpected (see below), but it was nice to be able to exercise the feature in VTune.
With the idea of getting something that would exercise more of the libraries and enhanced packages within the Intel Distribution for Python 2017 Beta (e.g., NumPy, SciPy), I remembered a project I had written for fun that used the BLAS routines with a Fortran code. This goal of the code was to identify the most occupied space on a traditional Monopoly game board. The algorithm used involves a Markov Chain computation with a repetitive matrix-vector product being calculated. The matrix holds the probabilities of moving from one space to another through a dice roll or as a result of drawing a Chance or Community Chest card and the vector is the current probability of landing on a given space from the initial condition of starting at Go. At some point the vector values from one multiplication to the next won’t change and this will signal the completion of the computation.
I found this problem scanning the Web many years ago which led me to the site “My Science Fair Project” by Andrew Pinzler. He described the setup of the transition matrix and how you performed the computations, his results, and how you could use the derived knowledge to your advantage in playing Monopoly. I wrote to Andrew after I’d worked up my Fortran code to thank him and let him know I’d gotten similar results using double precision arithmetic.
Thinking I could adapt my Fortran code to Python, I went searching through my personal archives. I couldn’t find anything. I went back to the web to see if I could locate the original website that had inspired me. Lucky for me the Internet never forgets anything. Andrew’s site is still up and I got all the setup details once again. (You can tell Andrew has gone on to bigger and better things. The site hasn’t been updated; it still references Parker Brothers as the owner of Monopoly and icons near the bottom claim the site is best viewed with Netscape Navigator 4.0.)
Banging away at my keyboard with a handy Python reference by my side, I recreated the program. If you're interested, my code is attached to this blog. (I had to put it in a zip file, but it is just the Python code in there. I was using Python 3.5 on Linux.) One thing that you’ll notice is that I made one modification from the original details described by Pinzler. The “Just Visiting/In Jail” square is actually two separate destinations that occupy the same corner on the board. These were treated as the same square in the website description, but I decided to set them up as two separate destinations. Yes, the dice roll destinations from both are the same, but landing in Jail is not the same as simply moving through the space on your way to States Avenue. Thus, I set up a “phantom” square for “In Jail” that is located between “Boardwalk” and “Go” within the transition matrix. After mechanically setting up the dice roll probabilities, I needed to fix up the matrix to skip over the phantom state and set its transitions to other properties to be the same as the “Just Visiting” square. Also, all transitions that go directly to jail (Do not pass GO, do not collect $200) were sent to the phantom square.
In order to make a more convincing examination of the advantages from the Intel Distribution for Python 2017 Beta, I initially hand coded the matrix-vector product with lists of lists as the matrix representation. It was my intention to examine this within VTune, find a majority of time spent in that function, and then contrast the use of NumPy matrices for these computations. Unfortunately, I soon realized I would have been better served trying to get a ride on the Reading Railroad.
There are only 40 squares on the traditional Monopoly board. With my added “In Jail” square, the matrix I needed was only 41x41. Running the code with my (poorly written) matrix-vector multiplication takes 188 iterations (rolls) before the solution stabilizes. It executes in the blink of an eye.
So, back to the drawing board. Maybe I should try something with recursion. (Something other than the boring N-Queens, of course.) It might not need special number-crunching packages, but it would give me something that executes for a sustained amount of time and I might have better luck analyzing the runtime with VTune-Amplifier.
While it doesn’t make a good test case for what I was setting out to do, it was a fun project (again). I tried to think up modifications that I might attempt that would extend the size of the matrix and the execution time. Casting around on the web I did find Ultimate Monopoly by jonizaak on the Deviant Art site. Something like that would create a larger matrix and could take quite a few more iterations to reach a solution. However, it’s not a real game, though it does look like it would be a fun way to spend some time on a rainy afternoon with friends. Or with snakes that can roll dice.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804