Hacking Threading Building Blocks into Cygwin, Part 3

My last post about hacking Threading Building Blocks into Cygwin ended with an "Unknown OS" error in file src/tbb/tbb_misc.h. This was a good sign because it meant I had things configured correctly enough for my Cygwin GCC compiler to actually start building TBB.

To reiterate, what I'm trying to do is indeed a "hack"; I'm not at this point doing what could be considered a nicely rendered port of TBB into the Cygwin environment. My objective is simply to find and make the modifications necessary so that I can launch Cygwin and apply TBB within that environment.

Unknown OS redux, redux, ...

The TBB source code in many areas is tailored for individual operating systems (_WIN32, _WIN64, __linux__, or __APPLE__). Where the code cannot be compiled because none of these is defined, the compiler stops and displays messages of this type:


In file included from ../../src/tbb/concurrent_queue.cpp:33:
../../src/tbb/tbb_misc.h:86:2: #error Unknown OS
make[1]: *** [concurrent_queue.o] Error 1


I had defined my operating system to be Cygwin. Hence, everywhere the TBB source provided OS-specific code, I had to add a branch to be compiled when the OS is Cygwin. Since I was building TBB using GCC, and since Cygwin is in many ways similar to Linux, I often reused the __linux__ code for my Cygwin TBB build. In most cases, this worked.
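
As a rough illustration of that pattern (not code from TBB itself), the sketch below adds a Cygwin case to a Linux branch. It assumes the compiler predefines __CYGWIN__, as Cygwin's GCC does; detected_platform() is a hypothetical helper name used only for this example.

```cpp
#include <string>

// Hypothetical helper, not part of TBB: reports which OS branch the
// preprocessor selected at compile time.
static std::string detected_platform() {
#if defined(_WIN32) || defined(_WIN64)
    return "windows";
#elif defined(__linux__) || defined(__CYGWIN__)
    // Cygwin shares the Linux branch, mirroring the approach in this post.
    return "linux-like";
#elif defined(__APPLE__)
    return "apple";
#else
    return "unknown";
#endif
}
```

Folding __CYGWIN__ into the existing __linux__ test keeps the change to one line per site, which suits a hack; a real port would more likely give Cygwin its own branch.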

Taking tbb_misc.h as an example, the original code included this set of OS-specific source:


namespace tbb {

static volatile int number_of_workers = 0;

#if defined(__TBB_DetectNumberOfWorkers)
static inline int DetectNumberOfWorkers() {
return __TBB_DetectNumberOfWorkers();
}
#else
#if _WIN32||_WIN64

static inline int DetectNumberOfWorkers() {
if (!number_of_workers) {
SYSTEM_INFO si;
GetSystemInfo(&si);
number_of_workers = static_cast<int>(si.dwNumberOfProcessors);
}
return number_of_workers;
}

#elif __linux__

static inline int DetectNumberOfWorkers( void ) {
if (!number_of_workers) {
number_of_workers = get_nprocs();
}
return number_of_workers;
}

#elif __APPLE__

static inline int DetectNumberOfWorkers( void ) {
if (!number_of_workers) {
int name[2] = {CTL_HW, HW_AVAILCPU};
int ncpu;
size_t size = sizeof(ncpu);
sysctl( name, 2, &ncpu, &size, NULL, 0 );
number_of_workers = ncpu;
}
return number_of_workers;
}

#else

#error Unknown OS

#endif /* os kind */

#endif


My first attempt was to copy the __linux__ definition of the DetectNumberOfWorkers() function into the #error Unknown OS section, and comment out the error message. In this case, it didn't work (it didn't compile). So, just to be able to move on, I hard-coded the number_of_workers variable to be 4 (since I'm doing this on my quad-core system). That's not how you'd do it if you were making a true port, mind you!

So, here's what my modified code segment looks like:


#else

//cygwin: use known setting (quad core)
static inline int DetectNumberOfWorkers( void ) {
//if (!number_of_workers) {
// number_of_workers = get_nprocs();
//}
number_of_workers = 4;
return number_of_workers;
}
//#error Unknown OS

#endif /* os kind */
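
For comparison, here is a minimal sketch of what a less hard-coded version might look like. It is not what I built, and it assumes that sysconf() on the target system supports _SC_NPROCESSORS_ONLN (a widely available extension rather than a guaranteed one). The function name detect_number_of_workers() is hypothetical.

```cpp
#include <unistd.h>

// Hypothetical replacement for the hard-coded value; not TBB's own code.
static int detect_number_of_workers() {
    static int number_of_workers = 0;  // cached after the first call
    if (number_of_workers == 0) {
        // Assumption: _SC_NPROCESSORS_ONLN is supported on this platform.
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        // Fall back to a single worker if the query fails.
        number_of_workers = (n > 0) ? static_cast<int>(n) : 1;
    }
    return number_of_workers;
}
```

Unlike the constant 4, this would follow the machine it runs on, at the cost of depending on one more system call being available.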


Next followed similar changes in response to "Unknown OS" errors in src/tbb/itt_notify.cpp and src/tbb/cache_aligned_allocator.cpp. In both cases, I replaced the "Unknown OS" error line with the code that was defined for __linux__.

Issues of various kinds...

Next, my build produced an error in src/tbb/task.cpp:


src/tbb/task.cpp:400: error: invalid conversion from 'int*' to 'int32_t*'


The code in question is checking a set of multibyte characters for the string "GenuineIntel". This didn't seem critical to successful operation of TBB within the Cygwin environment, so I commented out the entire area of code that was producing the error and set the result to true:


//! True if running on genuine Intel hardware
static inline bool IsGenuineIntel() {
bool result = true;
#if defined(__TBB_cpuid)
char info[16];
char *genuine_string = "GenuntelineI";
//cygwin: comment out line that caused error; set result to true
// __TBB_x86_cpuid( reinterpret_cast<int *>(info), 0 );
// The multibyte chars below spell "GenuineIntel".
//if( info[1]=='uneG' && info[3]=='Ieni' && info[2]=='letn' ) {
// result = true;
//}
// for (int i = 4; i < 16; ++i) {
// if ( info[i] != genuine_string[i-4] ) {
// result = false;
// break;
// }
// }
result = true; //cygwin
#elif __TBB_ipf
result = true;
#else
result = false;
#endif
return result;
}
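
The odd-looking "GenuntelineI" string comes from the order in which CPUID returns the vendor name: the text reads EBX, EDX, ECX, while the registers sit in memory as EBX, ECX, EDX. The sketch below shows this with the well-known register values for "GenuineIntel"; the helper names are hypothetical and no CPUID instruction is actually executed.

```cpp
#include <cstdint>
#include <string>

// Append the four bytes of a register, lowest byte first.
static void append_register(std::string& out, std::uint32_t reg) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<char>((reg >> shift) & 0xFF));
}

// Vendor name in reading order: EBX, then EDX, then ECX.
static std::string vendor_string(std::uint32_t ebx, std::uint32_t ecx,
                                 std::uint32_t edx) {
    std::string s;
    append_register(s, ebx);
    append_register(s, edx);
    append_register(s, ecx);
    return s;
}

// The same bytes in storage order: EBX, then ECX, then EDX.
static std::string memory_order_string(std::uint32_t ebx, std::uint32_t ecx,
                                       std::uint32_t edx) {
    std::string s;
    append_register(s, ebx);
    append_register(s, ecx);
    append_register(s, edx);
    return s;
}
```

With EBX = 0x756e6547, ECX = 0x6c65746e and EDX = 0x49656e69, the first function yields "GenuineIntel" and the second yields "GenuntelineI", which is the string the commented-out loop compares against.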


My next problem was related to build/version_info_linux.sh. I had created my cygwin.inc file using linux.inc as a template. version_info_linux.sh applies Linux commands to produce a version information string. Many scripting commands that are valid in Linux are not valid in Cygwin. So, I created a new file, build/version_info_cygwin.sh, and had my cygwin.inc file call that instead of calling the Linux version. I hard-coded variables as necessary simply to be able to move on quickly (again, this is a hack, not a port). Here's the core section of my version_info_cygwin.sh file:


#cygwin - based on version_info_linux.sh - many changes

echo "#define __TBB_VERSION_STRINGS \\"
#cygwin echo '"TBB: ' "BUILD_HOST\t\t"`hostname -s`" ("`arch`")"'" ENDL \'
echo '"TBB: ' "BUILD_HOST\t\tQUAD_CORE ("`arch`")"'" ENDL \'
echo '"TBB: ' "BUILD_OS\t\t"`head -1 /etc/issue | sed -e 's/\\\\//g'`'" ENDL \'
#cygwin echo '"TBB: ' "BUILD_KERNEL\t"`uname -rv`'" ENDL \'
echo '"TBB: ' "BUILD_KERNEL\t1.5.25"'" ENDL \'
echo '"TBB: ' "BUILD_GCC\t\t"`g++ -v 2>&1 | grep 'gcc.*version'`'" ENDL \'
[ -z "$COMPILER_VERSION" ] || echo '"TBB: ' "BUILD_COMPILER\t"$COMPILER_VERSION'" ENDL \'
echo '"TBB: ' "BUILD_GLIBC\t2.3.5"'" ENDL \'
echo '"TBB: ' "BUILD_LD\t\t"`ld -v | grep 'version'`'" ENDL \'
echo '"TBB: ' "BUILD_TARGET\t$arch on $runtime"'" ENDL \'
echo '"TBB: ' "BUILD_COMMAND\t"$*'" ENDL \'
echo ""
echo "#define __TBB_DATETIME \""`date -u`"\""


Almost there!

My next execution of make brought me all the way to the ld process. There, I was told that the -lrt option was not valid. The fix for this was simple: I edited my cygwin.gcc.inc file and removed the -lrt option:


LIB_LINK_FLAGS = -shared
#cygwin LIBS = -lpthread -lrt -ldl
LIBS = -lpthread -ldl


The next error message was:


src/tbbmalloc/MemoryAllocator.cpp:339:
#error highestBitPos() not implemented for this platform


This was somewhat similar to the earlier "Unknown OS" errors. But in this case, the base TBB code already provides a table-based fallback for __ARCH_unknown. I chose to reuse that code for Cygwin, to avoid having to get into the assembler instructions that are defined for the other platforms. So, this area of my MemoryAllocator.cpp looks like this:


static inline unsigned int highestBitPos(unsigned int number)
{
unsigned int pos;
#if __ARCH_x86_32||__ARCH_x86_64

# if __linux__||__APPLE__
__asm__ ("bsr %1,%0" : "=r"(pos) : "r"(number));
# elif (_WIN32 && (!_WIN64 || __INTEL_COMPILER))
__asm
{
bsr eax, number
mov pos, eax
}
# elif _WIN64 && _MSC_VER>=1400
_BitScanReverse((unsigned long*)&pos, (unsigned long)number);
# else
//cygwin
//# error highestBitPos() not implemented for this platform
static unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9};
MALLOC_ASSERT( number>=64 && number<1024, ASSERT_TEXT );
pos = bsr[ number>>6 ];
# endif

#elif __ARCH_ipf || __ARCH_unknown
static unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9};
MALLOC_ASSERT( number>=64 && number<1024, ASSERT_TEXT );
pos = bsr[ number>>6 ];
#else
# error highestBitPos() not implemented for this platform
#endif
return pos;
}
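
To convince yourself that the table does the same job as the bsr instruction, it can be checked against a plain shift loop. The sketch below is only an illustration: it assumes the table is indexed by number>>6 and is meant for inputs from 64 up to 1023, and both function names are hypothetical.

```cpp
#include <cassert>

// Portable reference: position of the highest set bit, found by shifting.
static unsigned int highest_bit_pos_loop(unsigned int number) {
    unsigned int pos = 0;
    while (number >>= 1)
        ++pos;
    return pos;
}

// Table lookup, assumed valid only for 64 <= number < 1024.
static unsigned int highest_bit_pos_table(unsigned int number) {
    static const unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9};
    assert(number >= 64 && number < 1024);
    return bsr[number >> 6];
}
```

Over that range the two functions agree for every input, so the table is a reasonable stand-in wherever inline assembly is not an option.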


Success!

I re-executed make release, and it was apparently successful: release versions of libtbb.so and libtbbmalloc.so were created!

Next I tried running tbbvars.sh, then making the sub_string_finder "Getting Started" example problem. The TBB *.h files were not being found. I tried lots of different things, and it became apparent that the TBB shared object files I'd just created weren't being noticed by the sub_string_finder build process. No matter what I did with my path definitions, nothing worked.

Finally, I tried copying the shared object files to new names, with the file extension changed from .so to .dll. The make worked!

I ran sub_string_finder_extended and got the following results:


$ ./sub_string_finder_extended.exe
Done building string.
Done with serial version.
Done with parallel version.
Done validating results.
Serial version ran in 6.291 seconds
Parallel version ran in 1.627 seconds
Resulting in a speed up of 3.86663


TBB is working under Cygwin! I see a 3.86 speedup on my quad-core Windows system.
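
The speedup figure is simply the serial time divided by the parallel time. As a small sketch of the arithmetic (the function names are hypothetical, and parallel efficiency is an extra metric not reported by the example program):

```cpp
// Speedup: how many times faster the parallel run was.
static double speedup(double serial_seconds, double parallel_seconds) {
    return serial_seconds / parallel_seconds;
}

// Parallel efficiency: speedup divided by the number of cores used.
static double efficiency(double speedup_value, int cores) {
    return speedup_value / cores;
}
```

With the times above, speedup(6.291, 1.627) comes to roughly 3.8666, and dividing by four cores gives an efficiency of about 0.97.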

Not all of the TBB examples worked. For example, tachyon and other examples that involve graphics aren't working. This could be due to missing components in my Cygwin installation. I'm not too worried about that. These TBB example problems do work under my current Cygwin TBB build:

    • concurrent_hash_map/count_strings

    • GettingStarted/sub_string_finder

    • parallel_reduce/primes

    • parallel_while/parallel_preorder

    • pipeline/textfilter

    • task/tree_sum

    • test_all/fibonacci



Conclusion

I'm quite thrilled to have TBB built and running using GCC under Cygwin, such that it fully utilizes all of my system's four cores. Next up: MinGW (Minimalist GNU for Windows); and ultimately a return to my "Building Threading Building Blocks on UWIN" effort.

Kevin Farnham, O'Reilly Media TBB Open Source Community, Freenode IRC #tbb, TBB Mailing Lists
