This paper presents a fast and accurate tsunami real-time operational model to compute across ocean-wide simulations completely on GPU (graphics processing unit). The spherical shallow water equations are solved using the method of characteristics and upwind cubic interpolation to provide high accuracy and stability. A customized, user interactive, tree-based mesh-refinement method is implemented based on distance from the coast and focal areas to generate a memory-efficient domain with resolutions of up to 50m. Three specialized and optimized GPU kernels (Wet, Wall and Inundation) are developed to compute the domain block mesh. Multi-GPU is used to further speed up the computation, and a weighted Hilbert space-filling curve is used to produce a balanced workload. Hindcasting of the 2004 Indonesian tsunami is presented to validate and compare the agreement of the arrival times and main peaks at several gauges. Inundation maps are also produced for Kamala and Hambantota to validate the accuracy of our model. Test runs on three Tesla P100 cards on Tsubame 3.0 could fully simulate 10h in just under 10min wall-clock time.