Final info about keysearch file format
Hi all,
I read tons of information about file format but I don't find an answer. Correct me if I'm wrong but in that contest, we don't count the input database file reading. We only count the "keysearch" file reading, the search, and the "result" file outputing.
My question: is the keysearch.txt file a "windows like" or "linux like" file? I mean, is each file line exactly 17 bytes (15 char key + 0x0d + 0x0a) or 16bytes (15 char key + 0x0d)?
So if I want to do windows contest submission, could I rely on a 16 or 17 bytes per line in the "keysearch.txt" file?
Thanks in advance Arnaud
| |
Re: Final info about keysearch file format
Hi all,
I read tons of information about file format but I don't find an answer. Correct me if I'm wrong but in that contest, we don't count the input database file reading. We only count the "keysearch" file reading, the search, and the "result" file outputing.
My question: is the keysearch.txt file a "windows like" or "linux like" file? I mean, is each file line exactly 17 bytes (15 char key + 0x0d + 0x0a) or 16bytes (15 char key + 0x0d)?
So if I want to do windows contest submission, could I rely on a 16 or 17 bytes per line in the "keysearch.txt" file?
Thanks in advance Arnaud
You should assume that the input data files were generated with a text editor or a program compiled and run on the target OS. For a windows submission, I guess that would mean 17 bytes per line.
--clay
| |
Re: Final info about keysearch file format
You should assume that the input data files were generated with a text editor or a program compiled and run on the target OS. For a windows submission, I guess that would mean 17 bytes per line.
--clay
Thanks Clay. So now it's sure linux entries will have a great advantage on windows entry (because of 16 bytes per line). In this challenge, if the files are big (and they should if you want to do "almost" 10 seconds processing), then the reading and writing result file is not small. (about the same time as the search in my first try, with a test file of 1000000 entry and 300000 search keys).
Too bad linux competitors will have such a great advantage over windows one! :-( (they can read the keysearch file with some large binary chunk and just patch "0" byte at the end of each string)
| |
Re: Final info about keysearch file format
Thanks Clay. So now it's sure linux entries will have a great advantage on windows entry (because of 16 bytes per line). In this challenge, if the files are big (and they should if you want to do "almost" 10 seconds processing), then the reading and writing result file is not small. (about the same time as the search in my first try, with a test file of 1000000 entry and 300000 search keys).
Too bad linux competitors will have such a great advantage over windows one! :-( (they can read the keysearch file with some large binary chunk and just patch "0" byte at the end of each string)
Please correct me if I'm wrong. Here's the way I understand this: What you're saying is that linux users can read the input file into memory in large chunks, thus, cutting down on the input time, then replace every 16th byte with ' ' and use a formula like i*16 to get the ith search string. But I fail to see why you can't do the same on a windows machine using i*17 instead...
| |
Re: Final info about keysearch file format
Please correct me if I'm wrong. Here's the way I understand this: What you're saying is that linux users can read the input file into memory in large chunks, thus, cutting down on the input time, then replace every 16th byte with ' ' and use a formula like i*16 to get the ith search string. But I fail to see why you can't do the same on a windows machine using i*17 instead...
Well, 16 will easily align into a cache line since it is a multiple of 4 (or 8), while 17 would end up acrossing two cache lines... I guess.
| |
Re: Final info about keysearch file format
Please correct me if I'm wrong. Here's the way I understand this: What you're saying is that linux users can read the input file into memory in large chunks, thus, cutting down on the input time, then replace every 16th byte with ' ' and use a formula like i*16 to get the ith search string. But I fail to see why you can't do the same on a windows machine using i*17 instead...
I think you're right^_^
| |
Re: Final info about keysearch file format
Please correct me if I'm wrong. Here's the way I understand this: What you're saying is that linux users can read the input file into memory in large chunks, thus, cutting down on the input time, then replace every 16th byte with ' ' and use a formula like i*16 to get the ith search string. But I fail to see why you can't do the same on a windows machine using i*17 instead...
For windows users input files will be slightly bigger - additional 64MB for 1GB input file. Also linux users may use single MOVNTDQ to read input data, while windows users will have to use 2 unaligned 64-bit MOVs.
---------------------------------------------
All about lock-free algorithms, multicore, scalability, parallel computing and related topics:
www.1024cores.net | |
Re: Final info about keysearch file format
For windows users input files will be slightly bigger - additional 64MB for 1GB input file. Also linux users may use single MOVNTDQ to read input data, while windows users will have to use 2 unaligned 64-bit MOVs.
Appreciate the response Dmitriy. Although, I'm afraid, that's ASM lingo that I don't quite understand (yet)... :)
Goes to show I still have a long way to go when it comes to optimizing code. I'll definitely google MOVNTDQ and respond if I can figure it out.
This competition is turning out to be quite a learning experience. :)
| |
Re: Final info about keysearch file format
Appreciate the response Dmitriy. Although, I'm afraid, that's ASM lingo that I don't quite understand (yet)... :)
Goes to show I still have a long way to go when it comes to optimizing code. I'll definitely google MOVNTDQ and respond if I can figure it out.
This competition is turning out to be quite a learning experience. :)
If I understand this right, MOVNTDQ is used to move 16 bytes from memory to an XMM register and vice-versa. I guess this would come into the picture while performing comparisons (please feel free to correct me). Theoretically, this should be possible irrespective of the operating system so long as the compiler supports SSE2. On windows, even if a line is 17 bytes long, the strings will still be 16 bytes long (including the null terminating character).
My question, I guess, is why would a windows based solution require 2 instructions? Does the vc++ compiler not support SSE2?
| |
Re: Final info about keysearch file format
If I understand this right, MOVNTDQ is used to move 16 bytes from memory to an XMM register and vice-versa. I guess this would come into the picture while performing comparisons (please feel free to correct me). Theoretically, this should be possible irrespective of the operating system so long as the compiler supports SSE2. On windows, even if a line is 17 bytes long, the strings will still be 16 bytes long (including the null terminating character).
My question, I guess, is why would a windows based solution require 2 instructions? Does the vc++ compiler not support SSE2?
MSVC does support SSE2. And there is always an option to use asm, however MSVC does not support asm in 64-bit mode, though ICC does support. But that's not the point, the point is that aligned moves are always faster, and on Windows platform the data is not aligned.
---------------------------------------------
All about lock-free algorithms, multicore, scalability, parallel computing and related topics:
www.1024cores.net | |
Re: Final info about keysearch file format
the point is that aligned moves are always faster, and on Windows platform the data is not aligned.
Hi Dmitriv. Totatlly agree with you, aligned data are faster (and so there is two different opcodes, load aligned and load unaligned).
Btw did you read a paper about Core I7, saying that now (on core i7) unaligned read is as fast as aligned read. I read this on a kind of "marketing" powerpoint. As I read somewhere else that thread-contest entries are run on a coreI7 machine, maybe the 17bytes windows alignment is not a problem.
What's your opinion? Do you think it's only "marketing" stuff (i can't believe un-aligned is as fast as aligned myself)
Arnaud
| | |