I'm seeing a consistent performance issue with the MKL random number generators. I'm generating 500 million uniformly distributed single- or double-precision floating-point numbers in the range 0.0 to 1.0. The first time, the throughput is N numbers/second; the second and subsequent times, it is either 2N or 3N numbers/second. With this many elements being generated, this doesn't look like a caching issue; it looks more like some sort of CPU/core affinity issue.
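To make the measurement concrete, here is a minimal sketch of the kind of timing loop described above. It is not the actual benchmark: `measure_rates` and `fill_uniform_std` are hypothetical names, and the standard-library generator is only a stand-in so the snippet compiles without MKL. In a real test the fill function would call the MKL routine instead (for example `vdRngUniform` on a VSL stream).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Time `runs` back-to-back calls of `fill` on the same buffer and return
// the throughput of each call in numbers per second.
std::vector<double> measure_rates(
    const std::function<void(std::vector<double>&)>& fill,
    std::size_t n, int runs) {
    assert(n > 0 && runs > 0);
    // Allocated once and reused by every run. std::vector zero-fills the
    // buffer here, so its memory is already written before the first run.
    std::vector<double> buf(n);
    std::vector<double> rates;
    for (int i = 0; i < runs; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        fill(buf);
        auto t1 = std::chrono::steady_clock::now();
        double secs = std::chrono::duration<double>(t1 - t0).count();
        rates.push_back(static_cast<double>(n) / secs);
    }
    return rates;
}

// Stand-in generator (NOT MKL): uniform doubles in the 0.0 to 1.0 range
// from the C++ standard library. Replace the body with the MKL call
// when measuring MKL itself.
void fill_uniform_std(std::vector<double>& buf) {
    static std::mt19937_64 engine(12345);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    for (double& x : buf) x = dist(engine);
}
```

Calling `measure_rates(fill_uniform_std, n, 5)` returns one rate per run, so the first entry can be compared directly against the later ones. Printing the per-run rates rather than an average is what exposes a first-run-only slowdown.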
Has anyone else experienced this? Any ideas? I'd hate to miss out on this level of performance.