A CUDA-based implementation of an improved SPH method on GPU