/*
Stockfish, a UCI chess playing engine derived from Glaurung 2.1
Copyright (C) 2004-2008 Tord Romstad (Glaurung author)
Copyright (C) 2008-2015 Marco Costalba, Joona Kiiski, Tord Romstad
Copyright (C) 2015-2020 Marco Costalba, Joona Kiiski, Gary Linscott, Tord Romstad
Stockfish is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Stockfish is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef MISC_H_INCLUDED
#define MISC_H_INCLUDED
#include <cassert>
#include <chrono>
#include <ostream>
#include <string>
#include <vector>
#include "types.h"
const std::string engine_info(bool to_uci = false);
void prefetch(void* addr);
void start_logger(const std::string& fname);
void dbg_hit_on(bool b);
void dbg_hit_on(bool c, bool b);
void dbg_mean_of(int v);
void dbg_print();
typedef std::chrono::milliseconds::rep TimePoint; // A value in milliseconds
static_assert(sizeof(TimePoint) == sizeof(int64_t), "TimePoint should be 64 bits");
inline TimePoint now() {
  return std::chrono::duration_cast<std::chrono::milliseconds>
        (std::chrono::steady_clock::now().time_since_epoch()).count();
}
template<class Entry, int Size>
struct HashTable {
  Entry* operator[](Key key) { return &table[(uint32_t)key & (Size - 1)]; }

private:
  std::vector<Entry> table = std::vector<Entry>(Size); // Allocate on the heap
};
enum SyncCout { IO_LOCK, IO_UNLOCK };
std::ostream& operator<<(std::ostream&, SyncCout);
#define sync_cout std::cout << IO_LOCK
#define sync_endl std::endl << IO_UNLOCK
/// xorshift64star Pseudo-Random Number Generator
/// This class is based on original code written and dedicated
/// to the public domain by Sebastiano Vigna (2014).
/// It has the following characteristics:
///
/// - Outputs 64-bit numbers
/// - Passes Dieharder and SmallCrush test batteries
/// - Does not require warm-up, no zeroland to escape
/// - Internal state is a single 64-bit integer
/// - Period is 2^64 - 1
/// - Speed: 1.60 ns/call (Core i7 @3.40GHz)
///
/// For further analysis see
/// <http://vigna.di.unimi.it/ftp/papers/xorshift.pdf>
class PRNG {

  uint64_t s;

  uint64_t rand64() {

    s ^= s >> 12, s ^= s << 25, s ^= s >> 27;
    return s * 2685821657736338717LL;
  }

public:
  PRNG(uint64_t seed) : s(seed) { assert(seed); }

  template<typename T> T rand() { return T(rand64()); }

  /// Special generator used to fast init magic numbers.
  /// Output values only have 1/8th of their bits set on average.
  template<typename T> T sparse_rand()
  { return T(rand64() & rand64() & rand64()); }
};
/// Under Windows it is not possible for a process to run on more than one
/// logical processor group. This usually means being limited to at most 64
/// cores. To overcome this, a platform-specific API must be called to set
/// the group affinity for each thread. Original code from Texel by
/// Peter Österlund.
namespace WinProcGroup {
  void bindThisThread(size_t idx);
}
#endif // #ifndef MISC_H_INCLUDED