Use an explicit variable to wait for the GPU thread, rather than duplicating the conditions of the big while loop. I *think* this makes no difference unless (a) there's a breakpoint and (b) the CPU thread gets really unlucky and observes the change to CPReadPointer but not the immediately following change to CPReadWriteDistance. Still, the difficulty of working that out demonstrates why the change is needed.
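
For illustration only, here is a minimal sketch of the pattern, not the actual code touched by this change. Only CPReadPointer and CPReadWriteDistance are names from the message above; the `Fifo` struct, `gpu_running`, `breakpoint_hit`, `kChunk`, and all three functions are hypothetical. The point is that the loop conditions live in one place (the GPU thread), and the CPU thread waits on a single flag instead of re-evaluating those conditions against state it might observe half-updated.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the shared FIFO state. Only CPReadPointer and
// CPReadWriteDistance are names from the commit message; the rest is assumed.
struct Fifo {
    std::atomic<uint32_t> CPReadPointer{0};
    std::atomic<uint32_t> CPReadWriteDistance{0};
    std::atomic<bool> breakpoint_hit{false};  // assumed flag
    std::atomic<bool> gpu_running{false};     // the explicit "GPU is busy" variable
};

constexpr uint32_t kChunk = 32;  // assumed transfer size per iteration

// GPU thread: the "big while loop". Its conditions are written here only.
void RunGpuLoop(Fifo& fifo) {
    while (fifo.CPReadWriteDistance.load() >= kChunk &&
           !fifo.breakpoint_hit.load()) {
        // Two separate updates: another thread can observe the first
        // without the second, which is the window described above.
        fifo.CPReadPointer.fetch_add(kChunk);
        fifo.CPReadWriteDistance.fetch_sub(kChunk);
    }
    // Publish "done" once, after every update made by the loop.
    fifo.gpu_running.store(false, std::memory_order_release);
}

// CPU thread: wait on the one flag instead of copying the loop conditions.
void WaitForGpu(const Fifo& fifo) {
    while (fifo.gpu_running.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
}

// Hypothetical driver: queue some bytes, start the GPU thread, wait for it,
// and return the read pointer as seen by the CPU thread after the wait.
uint32_t SubmitAndWait(Fifo& fifo, uint32_t bytes) {
    fifo.CPReadWriteDistance.store(bytes);
    fifo.gpu_running.store(true, std::memory_order_release);
    std::thread gpu([&fifo] { RunGpuLoop(fifo); });
    WaitForGpu(fifo);
    const uint32_t read = fifo.CPReadPointer.load();
    gpu.join();
    return read;
}
```

Because the flag is cleared only after the loop exits, the waiter never has to reason about which of the two counters it has seen updated.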