In this paper, a two-stage block hypothesis testing procedure, following the idea of Fan, Lin and Cheng (2004), is proposed for regression analysis of massive data. Variable selection criteria that incorporate the classical stepwise procedure are also developed to select significant explanatory variables. A simulation study confirms that the proposed approach is more accurate, in the sense of achieving the nominal significance level, for huge data sets. A real data example also verifies that the proposed procedure is accurate compared with the classical method.
Tsai-Hung Fan, Dennis K. J. Lin, Kuang-Fu Cheng