lunes, 9 de mayo de 2016

Writing Multi-Threading Code in C++ 11

Writing Multi-Threading Code in C++ 11 

  09/05/2016 Rober Villar

Developers have been writing threading code for years using Posix threads (a.k.a. Boost Threads). Posix is an older but problematic tool. For example, it has thread cancellation, which can cause a problem with std::thread, especially if you practice (Resource Acquisition is Initialization) when acquiring resources during object creation and deallocating during destructors. But there is an alternative. Here’s a look at adding multi-threading code in C++ 11 using the library.
I created a simple application: a program that evaluates five-card poker hands. The hands are made up of cards, each with two characters, with rank and suit like KC (King Clubs), 2H (Two Hearts) and so on. This is actually a well-defined problem; there is a very efficient and fast poker scorer out there. I wanted to see if a multi-threaded version, derived from a single-threaded version, would be significantly faster.
Evaluating Poker Hands
Poker hands range from the best, which is a Straight Flush, to the poorest, a high card. In between are pairs, two pairs, threes, straights (a run of sequential ranks, e.g. 5-6-7-8-9), a flush (all five cards are the same suit), a full house (a three of one rank plus a pair), four of the same rank (a.k.a. four of a kind), and the straight flush (a straight with all cards the same suit).
For the purposes of this application, a text file of 1 million randomly generated five-card hands was created. Each line has a hand of five cards, i.e. 10 chars long in the form RSRSRSRSRS, where “R” is Rank and “S” suit. For example: AS7D8S7HJD (ace of spades, seven of diamonds, eight of spades, seven of hearts and jack of diamonds). It should be evaluated as a pair, because there are two sevens.
The first five hands are this:
So, the program has to read all 10,000 lines and evaluate as fast as possible, and output something like that above. To make sure the evaluation worked correctly, I ran it again using these hands below, and output the results from it:
Next, we have the results with all hands sorted. For straights, the code has to do a little bit of finagling. First, I check it with aces regarded as low, then move it to the high position:
C++ 11
I’ve tried to use a few C++ 11 features. For instance, instead of an array, I used std::array for the new collection class. Also, the enums were the newer-class enum types. These require static_casts to convert from a size_t (an int) to the enumerate type. Also, I used the “for (auto variable : collection)” loops to keep the code simpler. If you need to modify the underlying object, make it use “auto& like this”, which sets the selected flag false on each card in a hand.
Implementing the Evaluator
I created a PokerCard class to hold a card, then a PokerHand object, which has a std::array<PokerCard,5> and populated the hand, card by card, from the string. I’ve added a friend method to the PokerHand class to help evaluate it. Plus, once the hand has been read in, I use a custom sort on the cards by rank (A, 2…K). This simplifies evaluating the hands.
Checking for two pairs and full house (three of one rank plus a pair), I first searched for pairs, threes and fours and set a flag on each card. If there was a pair or three, I scanned the unflagged cards to see if a pair or three was present. I used a pointer to the previous card so if a pair was found, both cards could have their selected flag set true.
The release version of the single-threaded code took 3.35 seconds for the million hands, or approximately 3.35 microseconds for each evaluation. That is not bad, and that included reading the almost-12 MB file line by line.
Could I improve upon it? Yes. This is the code in main() that would need changing:
Going Multithreaded
I’d made the friend function EvaluateHand() completely self contained. It’s monolithic, about 160 lines long, or roughly half the program. But I figured doing it that way might make it easier to call it multi-threaded. All it needs is a reference to an instance of a Poker Hand.
The easiest way (and most likely to succeed) is to use std::async and fire off 11 valuations at a time, maybe one per thread. With async, you don’t decide what threads it uses. But behind the scenes, it apparently uses a thread pool. This is the replacement code below. I’ve used a compiler #define for the symbol Multi. If you comment the #define line out and save the file, you get the single-threaded version. Leave it in for the Multitasking version.
It uses futures to hold the result of the valuation. A future is an object that waits for your asynchronously running code to complete and store the result. As it’s asynchronous, you have to call get() on every future to wait until the evaluation is done. So in theory, it sends off 12 valuations at a go. This is in the code to the right of futures[count++] =. It’s defined as a lambda (an anonymous function) and [str] is how you pass the string str in to it. MaxThreads is a constant, defined as 12, so I use 11 of the threads (one is needed for the main thread):
The value of MaxThreads was obtained by calling this in an earlier version:
Did it Work?
Despite considerable effort, this, at 4.33 seconds, actually took about 33 percent longer than the single-threaded version. My thought was that the evaluation execution time is so quick that the overhead of calling the tasks asynchronously just makes it slower.
To try out this theory, I added a Sleep(1), a one-millisecond delay to the EvaluateHand() function. I also dropped the number of hands to evaluate to just under 20,000, as a million would have taken too long. I then ran both versions. This time, the multi-tasking version took 3.25 seconds; the single version took about 31 seconds.
So the theory for speeding it up was sound, but my evaluation function was too fast!
I’ve uploaded the VC++ 15 project and source files, plus test files including the million hands, to GitHub .

No hay comentarios:

Publicar un comentario