Properties of the endogenous post-stratified estimator using a random forests model

Posted date: February 06, 2013
Publication Year: 2012
Authors: Tipton, John; Opsomer, Jean; Moisen, Gretchen
Publication Series: Paper (invited, offered, keynote)
Source: In: Morin, Randall S.; Liknes, Greg C., comps. Moving from status to trends: Forest Inventory and Analysis (FIA) symposium 2012; 2012 December 4-6; Baltimore, MD. Gen. Tech. Rep. NRS-P-105. Newtown Square, PA: U.S. Department of Agriculture, Forest Service, Northern Research Station. [CD-ROM]: 348-351.
Note: This article is part of a larger document.

Abstract

Post-stratification is used in survey statistics to improve the precision of estimates. In traditional post-stratification, the variable on which the data are stratified must be known at the population level. In many cases it is not, but a model can be used to predict its values from covariates, and the data can then be stratified on these predicted values. This method is called endogenous post-stratification estimation (EPSE). In this paper, we investigate methods to automatically select the number of post-strata for EPSE. We do this in the context of models fitted by Random Forests, with the stratum boundaries set at quantiles of the predicted distribution.
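
As a rough illustration of the estimator the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes that predictions have already been produced by some fitted model (for example, a random forest) for every population unit and for every sampled unit, that the sample is an equal-probability sample, and that stratum boundaries are placed at equally spaced empirical quantiles of the population predictions. The function names `quantile_boundaries` and `epse_mean` are hypothetical.

```python
from bisect import bisect_right


def quantile_boundaries(values, n_strata):
    """Interior cut points at equally spaced empirical quantiles (assumed rule)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_strata] for k in range(1, n_strata)]


def epse_mean(pred_population, pred_sample, y_sample, n_strata):
    """Post-stratified mean, with strata defined by model predictions.

    pred_population: model prediction for every unit in the population
    pred_sample:     model prediction for each sampled unit
    y_sample:        observed response for each sampled unit
    """
    cuts = quantile_boundaries(pred_population, n_strata)

    # Population count per stratum; a unit falls in stratum h when exactly
    # h of the cut points are less than or equal to its prediction.
    pop_counts = [0] * n_strata
    for p in pred_population:
        pop_counts[bisect_right(cuts, p)] += 1

    # Sample totals and counts per stratum.
    sums = [0.0] * n_strata
    counts = [0] * n_strata
    for p, y in zip(pred_sample, y_sample):
        h = bisect_right(cuts, p)
        sums[h] += y
        counts[h] += 1

    # A non-empty population stratum with no sampled units has no stratum mean.
    for n_pop, n_smp in zip(pop_counts, counts):
        if n_pop > 0 and n_smp == 0:
            raise ValueError("a post-stratum contains no sampled units")

    # Weight each stratum's sample mean by its population share.
    total = len(pred_population)
    return sum(
        (n_pop / total) * (s / c)
        for n_pop, s, c in zip(pop_counts, sums, counts)
        if n_pop > 0
    )


if __name__ == "__main__":
    population_predictions = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    sample_predictions = [2.0, 3.0, 7.0]
    sample_responses = [10.0, 20.0, 100.0]
    print(epse_mean(population_predictions, sample_predictions,
                    sample_responses, n_strata=2))  # prints 57.5
```

In this toy run the two strata each hold half of the population, so the estimate is the equally weighted average of the stratum sample means (15 and 100). Increasing `n_strata` gives finer strata but raises the chance that a stratum has no sampled units, which is one reason the choice of the number of post-strata matters.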

Citation

Tipton, John; Opsomer, Jean; Moisen, Gretchen G. 2012. Properties of the endogenous post-stratified estimator using a random forests model. In: Morin, Randall S.; Liknes, Greg C., comps. Moving from status to trends: Forest Inventory and Analysis (FIA) symposium 2012; 2012 December 4-6; Baltimore, MD. Gen. Tech. Rep. NRS-P-105. Newtown Square, PA: U.S. Department of Agriculture, Forest Service, Northern Research Station. [CD-ROM]: 348-351.