concurrent_hash_map() -- sorted insertion, sorting and thread-safe iteration

Igor Levicki
Total Points:
10,865
Status Points:
10,865
Black Belt
July 4, 2009 9:06 PM PDT
Rate
 
#2 Reply to #1
Hello, thanks for taking the time to reply.

However, I am not sure I understand what you mean.

1. Why are you comparing C++ map to the TBB hash_map when C++ also has a hash_map?

2. As far as I know C++ hash_map can be kept sorted or at least I have managed to do it with Microsoft implementation when I was experimenting with it. Perhaps that was an error or a pure luck?

As for the example, lets say we have a key container:

typedef struct _Key {
	wchar_t		FileName[MAX_PATH];
	DWORD		NameLength;
} Key;

And a value container:

typedef struct _Value {
	ULONGLONG	FileSize;
	FILETIME	FileTime;
	BOOL		Selected;
} Value;

Now wrap those in classes:

class	CKey
{
public:
	CKey(const wchar_t *FileName)
	{
		wcscpy_s(_Data.FileName, MAX_PATH, FileName);
		_Data.NameLength = wcslen(_Data.FileName);
	}

	~CKey(void)
	{
	}

	// operator < supporting natural sort order
	bool operator<(const CKey& x) const
	{
		return (StrCmpLogicalW(_Data.FileName, x._Data.FileName) < 0);
	}

	friend wostream& operator<<(wostream& o, const CKey& x)
	{
		o << L"FileName     : " << x._Data.FileName << endl;
		return o;
	}

	Key	_Data;
};
And the value as well:

class	CValue
{
public:
	CValue(void)
	{
	}

	~CValue(void)
	{
	}

	friend wostream& operator<<(wostream& o, const CValue& x)
{
FILETIME lt;
SYSTEMTIME st;

FileTimeToLocalFileTime(&x._Data.FileTime, &lt);
FileTimeToSystemTime(&lt, &st);

o << L"FileTime : " << st.wYear << L"-" << st.wMonth << L"-" << st.wDay << L" " << st.wHour << L":" << st.wMinute << L":" << st.wSecond << L"." << st.wMilliseconds << endl;
o << L"FileSize : " << x._Data.FileSize << endl;
o << L"Selected : ";
if (x._Data.Selected) {
o << L"TRUE";
} else {
o << L"FALSE";
}
o << endl;
return o;
}

private: Value _Data; };
I have left out CValue methods for the sake of brevity. Now we define types for the list:

typedef hash_map<const CKey, CValue> TList;
typedef hash_map<const CKey, CValue>::iterator TIter;
typedef pair<const CKey, CValue> TPair;
typedef pair TInsert;
And extend stdext with our custom hash function:

namespace stdext {
	// (One-at-a-Time Hash method by Bob Jenkins)
	size_t hash_value(const CKey& x)
	{
		size_t h = 0;

		for (size_t i = 0; i < x._Data.NameLength; i++) {
			h += x._Data.FileName[i];
			h += (h << 10);
			h ^= (h >> 6);
		}

		h += (h << 3);
		h ^= (h >> 11);
		h += (h << 15);

		return h;
	}
};
Lets say that we have a list class:

class CImageList
{
public:
	CImageList(void)
	{
	}

	~CImageList(void)
	{
	}

	BOOL Add(const wchar_t *FileName)
	{
		CKey	key(FileName);
		CValue	val;
		TInsert	RetVal;

		val.SetSelected(TRUE);
		// insert does not invalidate iterators!
		RetVal = _List.insert(TPair(key, val));

		return RetVal.second;
	}

	void Dump(void)
	{
		ENTER_CS;
		wcout << L"BEGIN" << endl;
		for (TIter i = _List.begin(); i != _List.end(); i++) {
			wcout << i->first;
			wcout << i->second;
		}
		wcout << L"END" << endl;
		LEAVE_CS;
	}
private:
	TList			_List;
};
Again, I left out some of the methods, we just need add to check. Then we use the list class:

	CImageList	all;

	all.Add(L"z100.jpg");
	all.Add(L"z2.jpg");
	all.Add(L"z1.jpg");
	all.Dump();
This worked for me -- it somehow magically sorted the elements on insertion so I got the output:

z1
z2
z100

when iterating.

Now my questions:

1. Do you have any ideas if what I did was by accident, or it really works that way?
2. How to make this code thread-safe except by rolling my own locking?

Finally, what you said about users being slow is true -- what is fast is the file adding/removing part which is done in another thread by watching a folder which being browsed by the user. Files can be added by a download manager, archiver (on unpacking) and removed by user (program allows to select and delete multiple files from the list). Lets say that I need fast add/remove for up to 10,000 elements, naturally sorted list and sequential iterators and all of it should be thread-safe. If there is a better way to accomplish this than the hash_map() I would be very glad to learn how.


--------
If you find my post helpfull, please rate it and/or select it as a best answer where applies. Thank you.


Intel Software Network Forums Statistics

8489 users have contributed to 31627 threads and 100761 posts to date.
In the past 24 hours, we have 33 new thread(s) 145 new posts(s), and 197 new user(s).

In the past 3 days, the most popular thread for everyone has been gemm(A,A,A) like possible? The most posts were made to Crash when loading skeleton The post with the most views is Dear Steve, excuse me for a d

Please welcome our newest member chat1983