Identification of Wealthy Households from the Residential Property Price Index Database for Sample Selection for Household Surveys

By Evren Ceritoglu


Identification of Wealthy Households from the Residential Property Price Index Database for Sample Selection for Household Surveys Evren CERİTOĞLU Özlem SEVİNÇ October 2020 Working Paper No: 20/10
© Central Bank of the Republic of Turkey 2020 Address: Central Bank of the Republic of Turkey Head Office Structural Economic Research Department Hacı Bayram Mh. İstiklal Caddesi No: 10 Ulus, 06050 Ankara, Turkey Phone: +90 312 507 80 04 Facsimile: +90 312 507 78 96 The views expressed in this working paper are those of the author(s) and do not necessarily represent the official views of the Central Bank of the Republic of Turkey.
Identification of Wealthy Households from the Residential Property Price Index Database for Sample Selection for Household Surveys*

Evren Ceritoğlu (a) and Özlem Sevinç (b)

* We would like to thank the editor, the anonymous referee, Özgül Atılgan Ayanoğlu, Duygu Konukçu Çelik, Ezgi Deryol and Erdi Kızılkaya for their contribution to this paper.

(a) Economist, Structural Economic Research Department, Central Bank of the Republic of Turkey (CBRT), Ulus / Ankara 06050 TURKEY Telephone no: + 90 312 5078024 Fax number: + 90 312 5075732 E-mail: evren.ceritoglu@tcmb.gov.tr

(b) Assistant Specialist, Structural Economic Research Department, Central Bank of the Republic of Turkey (CBRT), Ulus / Ankara 06050 TURKEY Telephone no: + 90 312 5078053 Fax number: + 90 312 5075732 E-mail: ozlem.sevinc@tcmb.gov.tr

Abstract

This paper aims to identify wealthy households in Turkey for sample selection in household surveys. In the absence of income and wealth tax data, we analyze house prices from the Residential Property Price Index (RPPI), which the Central Bank of the Republic of Turkey (CBRT) constructs from dwelling appraisal reports to monitor price movements. The CBRT has announced the RPPI monthly for Turkey and for 26 geographical regions at NUTS2 level since 2012, but the data start in January 2010. The RPPI database comprises more appraisal observations from İstanbul and the western provinces, where house prices are significantly higher than the country average. By contrast, the number of appraisal observations is low for the eastern provinces, since the number of house sales is limited in poor and small provinces. Moreover, the percentage of mortgaged house sales is even lower in these regions, and the RPPI database is based on dwelling appraisal reports for house sales that are subject to mortgage loans.

We examine unit house prices from the CBRT – RPPI database from 2010 to 2018 at province, district and neighborhood levels. Unit house prices are calculated by dividing the value (TL) by the gross usage area (m²) at current prices. Only neighborhoods with 30 or more observations are examined in the analysis. We discuss the validity of the hypothesis that there is a direct relationship between unit house prices and the number of home appraisals. We regress the natural logarithm of the number of home appraisals on the natural logarithm of unit house prices using mean values. We perform fixed effects regressions at both neighborhood and province levels using our unbalanced and balanced panel data sets. We control for year effects by introducing time dummy variables into the regressions.

We find that there is a positive and statistically significant relationship between unit house prices and the number of home appraisals. Moreover, we perform the same regressions for neighborhoods that have more than 100 observations as a robustness check. We observe that the size and the sign of the regression coefficients do not change when we restrict our data set. The direction of the relationship might run from the number of home appraisals to unit house prices, or it could run both ways. For that reason, as another robustness check, we regress the natural logarithm of unit house prices on the natural logarithm of the number of home appraisals. We observe that there is a statistically significant relationship between the number of home appraisals and unit house prices. However, the size of the regression coefficients is considerably lower in this case. As a result, our empirical analysis indicates that the number of observations is higher in administrative units where house prices are higher. Therefore, we argue that identification of wealthy households according to their neighborhoods using the RPPI database is a reliable and consistent method for oversampling for household surveys in Turkey.

Key words: Unit house prices, wealthy households, panel data, sampling design, oversampling

JEL codes: C33, C83, R21, R31, R32
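The fixed effects regression summarized in the abstract (log number of appraisals on log unit house price, with unit effects and year dummies) can be illustrated with a minimal sketch. It is not the authors' code. It assumes a balanced panel and a single regressor, in which case the two-way within transformation gives the same slope as including unit and year dummies. The function name, the data layout and the example figures are hypothetical.

```python
import math


def two_way_within_slope(panel):
    """Slope of log(appraisals) on log(unit price) with unit and year effects.

    panel maps (unit, year) -> (number_of_appraisals, mean_unit_price) and
    must be balanced: every unit is observed in every year.
    """
    units = sorted({u for u, _ in panel})
    years = sorted({t for _, t in panel})
    if len(panel) != len(units) * len(years):
        raise ValueError("this sketch only handles a balanced panel")

    log_a = {k: math.log(a) for k, (a, _) in panel.items()}
    log_u = {k: math.log(p) for k, (_, p) in panel.items()}

    def demean(v):
        # Two-way within transformation: subtract the unit mean and the
        # year mean, then add back the grand mean.
        unit_mean = {u: sum(v[(u, t)] for t in years) / len(years) for u in units}
        year_mean = {t: sum(v[(u, t)] for u in units) / len(units) for t in years}
        grand = sum(v.values()) / len(v)
        return {(u, t): v[(u, t)] - unit_mean[u] - year_mean[t] + grand
                for (u, t) in v}

    x, y = demean(log_u), demean(log_a)
    sxx = sum(val * val for val in x.values())
    sxy = sum(x[k] * y[k] for k in x)
    return sxy / sxx


# Hypothetical neighborhoods A, B, C observed in three years:
# (number of appraisals, mean unit price in TL per square meter).
example = {
    ("A", 2016): (40, 2000.0), ("A", 2017): (55, 2600.0), ("A", 2018): (60, 3100.0),
    ("B", 2016): (120, 4000.0), ("B", 2017): (150, 5500.0), ("B", 2018): (160, 6200.0),
    ("C", 2016): (35, 1500.0), ("C", 2017): (38, 1700.0), ("C", 2018): (50, 2400.0),
}
print(round(two_way_within_slope(example), 3))
```

Because both variables are in logarithms, the slope reads as an elasticity. Swapping the two variables gives the reverse regression that the abstract uses as a second robustness check. An unbalanced panel would need explicit dummies or an iterative demeaning step, which this sketch does not cover.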
Non-Technical Summary

This paper proposes a method to identify wealthy households in order to oversample them on a neighborhood basis in household surveys in Turkey. In an ideal world, the results of a household survey represent the entire population. In practice, however, wealth-related surveys in particular fail to interview wealthy households. It is of great importance that they are represented in the sample in a balanced manner, since household assets and liabilities are mainly concentrated in the upper income groups. In such wealth-related surveys, an efficient approach is to oversample wealthy households, that is, to contact proportionally more wealthy households in the survey. Income and wealth tax data at the individual or household level are the most appropriate source of information for identifying wealthy households. However, income and wealth tax data are not available at the individual or household level in Turkey. Moreover, there are no direct data on the addresses of the wealthiest people, including their provinces, districts and neighborhoods.

Therefore, to identify wealthy households in Turkey, we analyze unit house prices derived from the Central Bank of the Republic of Turkey (CBRT) – Residential Property Price Index (RPPI) database between 2010 and 2018 at province, district and neighborhood levels. One positive aspect of the RPPI database is that it is nationally representative and provides information about geographical regions at NUTS2 level. Another positive aspect of accessing house price information at the neighborhood level is that it yields a tool that can be matched with the sampling frame of the Turkish Statistical Institute (TURKSTAT). Housing wealth is often the largest component of household wealth. Moreover, household income and housing wealth are directly related to each other. Therefore, we assume in this paper that wealthy families live in more expensive neighborhoods.
We argue that identification of wealthy households according to their neighborhoods using the RPPI database is a reliable and consistent method for oversampling them in household surveys in Turkey. Unit house prices are calculated by dividing the value of the residence (TL) by the gross usage area (m²) at current prices. We demonstrate that the distribution of neighborhoods and provinces with respect to unit house prices is very similar to the income distribution across the country. We also find that there is a positive and statistically significant relationship between unit house prices and the number of house sales at neighborhood and province levels. Thus, the empirical analysis confirms that our hypothesis is valid for the Turkish economy. The implementation of this method would be an innovation, since TURKSTAT has not previously used a sampling design that enables oversampling of wealthy households in its surveys.
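The unit house price calculation and the minimum-observation rule mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the CBRT procedure: the record layout (neighborhood, value in TL, gross usage area in m²) and the function name are hypothetical, and the 30-observation threshold is applied to whatever set of records is passed in.

```python
from collections import defaultdict
from statistics import mean

MIN_OBS = 30  # threshold stated in the paper's abstract


def neighborhood_mean_unit_prices(appraisals, min_obs=MIN_OBS):
    """Return {neighborhood: (number_of_appraisals, mean_unit_price)}.

    appraisals is an iterable of (neighborhood, value_tl, gross_area_m2).
    The unit price of each appraisal is value divided by gross usage area.
    Neighborhoods with fewer than min_obs appraisals are dropped.
    """
    unit_prices = defaultdict(list)
    for neighborhood, value_tl, gross_area_m2 in appraisals:
        if gross_area_m2 <= 0:
            continue  # skip records with an unusable area
        unit_prices[neighborhood].append(value_tl / gross_area_m2)
    return {
        name: (len(prices), mean(prices))
        for name, prices in unit_prices.items()
        if len(prices) >= min_obs
    }


# Hypothetical records: 30 appraisals in one neighborhood, 5 in another.
records = [("Neighborhood X", 450_000.0, 120.0)] * 30
records += [("Neighborhood Y", 200_000.0, 100.0)] * 5
print(neighborhood_mean_unit_prices(records))
```

The output pairs each retained neighborhood with its appraisal count and mean unit price, which are the two quantities the regressions relate to each other.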
I. Introduction

The aim of this paper is to develop a reliable method to identify wealthy households for the sampling design of household surveys in Turkey. Wealthy households hold a larger share of financial assets and liabilities. Moreover, they own a wider variety of financial assets and liabilities (Causa et al., 2019). Similarly, Bertaut and Starr-McCluer (2002) state that ownership of a variety of financial assets and liabilities increases with wealth in the U.S. economy, except for credit card balances and some kinds of debt. They also argue that there is a large gap in the intensity of assets and liabilities across wealth groups. When designing a survey on assets and liabilities, it is necessary to reach the wealthiest households as far as possible in order to represent the complete distribution of wealth accurately (Balestra and Tonkin 2018; Vermeulen, 2016 and 2018).

In this context, it would be better to use information from administrative data to oversample wealthy households. Previous empirical literature suggests that income and wealth tax data at the individual or household level are the most appropriate source of information for identifying wealthy households (Bricker et al., 2016). However, income and wealth tax data are not available at the individual or household level in Turkey. For that reason, the sampling strategy must focus on finding a variable that reflects household wealth at the most detailed level that available resources allow. Moreover, it should be possible to match the selected proxy variable with the sampling frame of the Turkish Statistical Institute (TURKSTAT).
As a result, in the absence of tax data, we analyze house prices from the Residential Property Price Index (RPPI), which is constructed by the Central Bank of the Republic of Turkey (CBRT) from dwelling appraisal reports to monitor price movements in Turkey.[1] One of the positive aspects of the RPPI database is that it is nationally representative and provides information about geographical regions at NUTS2 level. Another positive aspect of accessing house price information at the neighborhood level is that it yields a tool that can be matched with TURKSTAT's sampling frame.[2]

[1] https://www.tcmb.gov.tr/wps/wcm/connect/EN/TCMB+EN/Main+Menu/Statistics/Real+Sector+Statistics/Residential+Property+Price+Index/

[2] TURKSTAT is one of the few institutions that has access to the addresses of households in Turkey, and it has the authority to provide this information, upon official request and under certain conditions, for household surveys carried out by other institutions.

Housing wealth is often the largest component of household wealth. Moreover, household income and housing wealth are directly related to each other. For that reason, we assume that wealthy families live in more expensive neighborhoods. In particular, we discuss the validity of the hypothesis that there is a direct relationship between house prices and the number of house sales. To test this hypothesis, we perform econometric tests using unit house prices and the number of appraisal reports from the RPPI database at both neighborhood and province levels, using balanced and unbalanced panel data sets. The establishment of such a relationship will indicate that the RPPI database is sufficient to identify wealthy households for sample selection for household surveys in Turkey.

The main contribution of this paper is to show that unit house prices successfully predict the spatial distribution of income in Turkey and can be used for sample selection on a neighborhood basis. Accordingly, we first demonstrate that the distribution of neighborhoods and provinces with respect to unit house prices is very similar to the income distribution across the country, which is measured using both aggregate and micro-economic data. Second, we find that there is a positive and significant association between the number of home appraisals and unit house prices at both neighborhood and province levels by performing econometric estimations using balanced and unbalanced panel data sets. These empirical findings suggest that the number of home transactions is higher in wealthy neighborhoods. Thus, we can argue that the small number of observations in the RPPI database from poor regions is not a major obstacle to the mapping of wealthy neighborhoods. Third, as a robustness check, we control for the roles of income per capita, housing supply and population growth in the relationship between the number of home appraisals and unit house prices at province level in both unbalanced and balanced panel data estimations. We confirm that the positive relationship between the number of home appraisals and unit house prices is robust to the inclusion of control variables in the empirical analysis. As a result, we conclude that unit house prices can also be used for oversampling of wealthy households according to the method proposed in this paper, considering the importance of real estate ownership in the distribution of household wealth.
Finally, the implementation of this method would be an innovation, since TURKSTAT has not previously used a sampling design that enables oversampling of wealthy households in its surveys.

The outline of the paper is as follows: Section II discusses the literature on sampling design in household surveys with a special emphasis on oversampling of wealthy households. Section III presents the theoretical background. Section IV provides a descriptive analysis of the RPPI database and Section V presents the econometric results. Finally, Section VI concludes the paper with a brief summary of our findings.

II. Sampling Design

In most of the countries coordinated by the European Statistical Office (Eurostat), multistage stratified cluster sampling is adopted as the main sampling method for household surveys. In Turkey, TURKSTAT also implements this method successfully, using two-stage stratified cluster sampling. The sample selection can be summarized as follows. At the beginning, a sampling frame that includes information about all household addresses should be set and the coverage of the population should be determined. All households that live in Turkey are included in the TURKSTAT National Address Database (NAD), which is based on the Address Based Population Registry System, while the institutionalized population – individuals that live in dormitories, guesthouses, childcare centers, nursing homes, private hospitals, prisons and military barracks – is excluded from the sampling frame.[3] The NAD is updated every six months to account for people moving from one address to another. Sites such as villages with fewer than 20 households, which account for at most 1% of the country's population, are not covered in the sampling frame.

The sample design strata are defined by geographic regions (NUTS) and area types (urban and rural). The type of geographic region used can change from one survey to another. For example, if the implicit strata are defined as NUTS3 (81 provinces) and urban-rural areas, there are 162 strata (81 × 2). The urban and rural definition is based on the number of residents living in a site.

In the first stage, the primary sampling units (PSUs), called blocks and consisting of approximately 100 addresses each, are derived from the sampling frame. PSUs are formed within sites that have a municipality, whereas each village constitutes a PSU by itself. The PSUs are selected by the probability proportional to size (PPS) method, through systematic sampling of PSUs ordered by geographical level (NUTS1, NUTS2, Province, District, etc.).[4] Then the final sampling units, which are households, are selected systematically from the primary sampling units.[5] The block system that TURKSTAT applies compares very well with those of other countries, since the blocks are very small, which is a necessary and useful property that raises the efficiency of the probability sample selection stage.
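The two selection stages described above (PPS systematic selection of blocks, then systematic selection of households within a block) can be sketched as follows. This is a generic textbook illustration, not TURKSTAT's implementation. The function names, the block sizes and the sample sizes are hypothetical, and the input list is assumed to be already sorted by geography within one stratum.

```python
import random


def pps_systematic(psus, n, rng):
    """Select n PSUs with probability proportional to size, systematically.

    psus is a list of (psu_id, size) pairs, already ordered geographically.
    """
    total = sum(size for _, size in psus)
    step = total / n
    start = rng.random() * step  # random start in [0, step)
    selected = []
    cumulative = 0.0
    k = 0  # index of the next selection point, located at start + k * step
    for psu_id, size in psus:
        cumulative += size
        while k < n and start + k * step < cumulative:
            # A PSU larger than the step can be hit more than once.
            selected.append(psu_id)
            k += 1
    return selected


def systematic_households(addresses, n, rng):
    """Select n addresses from one block at a fixed interval after a random start."""
    step = len(addresses) / n
    start = rng.random() * step
    last = len(addresses) - 1
    return [addresses[min(int(start + k * step), last)] for k in range(n)]


rng = random.Random(2020)
# Hypothetical stratum: 12 blocks of roughly 100 addresses each.
blocks = [(f"block-{i:02d}", 90 + (i % 3) * 10) for i in range(12)]
chosen_blocks = pps_systematic(blocks, 4, rng)
households = systematic_households(list(range(1, 101)), 10, rng)
print(chosen_blocks, households)
```

With blocks of similar size, PPS selection is close to equal-probability selection of blocks, which is one reason small and uniform blocks make the design efficient.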
All major TURKSTAT household surveys, including the Household Budget Survey (HBS), the Household Labor Force Survey (LFS) and the Survey on Income and Living Conditions (SILC), follow the sampling methodology described above.

In an ideal world, the results of a survey represent the entire population. In practice, however, wealth-related surveys in particular fail to interview wealthy households. Wealthy households are less willing to participate in surveys. It is of great importance that they are represented in the sample in a balanced manner and that they give correct answers to the questions asked, since household assets and liabilities are mainly concentrated in the upper income group (Causa et al., 2019). We observe that the dispersion of household disposable income is significantly larger in upper income groups than in lower income groups (Figure 1). The high degree of dispersion makes it difficult to estimate mean and median levels correctly in upper income groups. A randomly selected observation is closer to the mean and median values in lower income groups, whereas a randomly selected observation could be significantly different from the mean and median values in upper income groups. In addition, the non-response rate is generally higher among upper income groups than among lower income groups in household surveys. As a result, at the sampling design stage, we need to select more observations from upper income groups to reach unbiased estimates not only for these groups, but also for the whole population.[6]

Figure 1 – The Highest Annual Equivalised Household Disposable Income for Cumulative Percentage Income Groups* (Current Prices, TL)
[Figure: annual series for the 10%, 25%, 50%, 75% and 90% cumulative income groups, 2006 to 2018; vertical axis from 0 to 50,000 TL.]
Source: TURKSTAT SILC
* Household disposable income is adjusted for family size and intra-household resource allocation between adults and children.

In wealth-related surveys, an efficient approach is to oversample wealthy households, that is, to contact proportionally more wealthy households in the survey (Chakraborty and Waltl, 2018). Moreover, in order for the sample to represent the distribution of wealth in the population consistently, it is important to have a higher proportion of wealthy households in the sample than in the normal sample distribution (Kennickell, 2008). This approach leads to more observations in a certain part of the distribution than would be obtained from the original sampling frame. Besides reaching wealthy households, the oversampling method can also be used for finding rare sub-populations (Kalton, 2009) and hard-to-reach segments of the population such as homeless persons, drug users and victims of female circumcision (Marpsat and Razafindratsima, 2010).

[3] http://www.turkstat.gov.tr/PreTablo.do?alt_id=1059

[4] Please see Appendix 1 for more information on geographical distribution.

[5] http://www.turkstat.gov.tr/UstMenu.do?metod=metabilgi

[6] It is necessary to consider family size and intra-household resource allocation in calculating income distribution indicators such as the Gini coefficient and the poverty line. TURKSTAT uses the OECD equivalence measure in all household surveys. The OECD equivalence scale assigns a value of 1 to the reference person in the household, 0.5 to household members who are 14 and older, and 0.3 to household members who are younger than 14. Household disposable income is divided by the OECD equivalence scale. Thus, it becomes possible to compare households of different sizes and types with each other.
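The equivalence adjustment described in the footnote on the OECD scale can be illustrated with a short sketch. The weights (1, 0.5 and 0.3) and the age cut-off of 14 are taken from the text; the function names and the example household are hypothetical.

```python
def oecd_equivalence_scale(other_member_ages):
    """Equivalence scale for a household.

    The reference person counts as 1. Every other member counts as 0.5 if
    aged 14 or older and as 0.3 if younger than 14.
    """
    scale = 1.0
    for age in other_member_ages:
        scale += 0.5 if age >= 14 else 0.3
    return scale


def equivalised_income(disposable_income, other_member_ages):
    """Household disposable income divided by the equivalence scale."""
    return disposable_income / oecd_equivalence_scale(other_member_ages)


# Hypothetical household: reference person, one other adult, children aged 10 and 5.
print(oecd_equivalence_scale([40, 10, 5]))
print(equivalised_income(42_000.0, [40, 10, 5]))
```

A single-person household has a scale of 1, so its equivalised income equals its disposable income; larger households are scaled down by less than their head count, which reflects shared costs.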
In this case, the most accurate approach seems to be to include the wealthiest people in the sample more than usual. Valliant et al. (2014) studied the use of artificial variables based on commercial sources as stratification variables in sampling in order to reach sub-groups in the population. There are many ways to find the wealthiest individuals. Wealth and income tax data are among them, and they are the most reliable and appropriate sources. In Europe, the Household Finance and Consumption Survey (HFCS), coordinated by the European Central Bank (ECB), has been conducted for some time, and wealthy people are included more heavily in it with the help of the oversampling method.[7] The hardest part of finding the wealthy with the oversampling method is the need for current information that represents the entire population in the sampling frame. The more variables are used for oversampling, and the stronger the relationship between those variables and wealth, the more successful the results of the method.

Countries try to apply the oversampling method within the limits of the data that help them find wealthy people. Spain and France use personal taxable wealth data, which is the best indicator of wealth, while Estonia, Latvia, Luxembourg and Finland use personal income as an indicator of wealth. House prices are another indicator of wealth, used by Belgium, Germany and Greece, while Poland and Portugal use property size to identify wealth. The Household Finance and Consumption Network (HFCN, 2016a and 2016b) reports that electricity consumption, regional income, personal education and labor status are some of the other ways of finding wealthy households. In Turkey, there are no direct data on the addresses of the wealthiest people, including their provinces, districts and neighborhoods, which are the existing administrative units.
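How oversampling can coexist with unbiased population estimates is easiest to see through design weights. The sketch below is a generic illustration of inverse-probability weighting in a stratified sample, not a description of the HFCS or TURKSTAT weighting schemes. The strata, counts and values are hypothetical.

```python
def design_weights(population_counts, sample_counts):
    """Weight per stratum: households in the population per sampled household."""
    return {
        stratum: population_counts[stratum] / sample_counts[stratum]
        for stratum in sample_counts
    }


def weighted_mean(observations, weights):
    """Weighted mean of (stratum, value) observations."""
    numerator = sum(weights[stratum] * value for stratum, value in observations)
    denominator = sum(weights[stratum] for stratum, _ in observations)
    return numerator / denominator


# Hypothetical population: 100 wealthy and 900 other households.
population = {"wealthy": 100, "other": 900}
# Oversampled design: 50 households from each stratum.
sample = [("wealthy", 100.0)] * 50 + [("other", 10.0)] * 50
weights = design_weights(population, {"wealthy": 50, "other": 50})

unweighted = sum(value for _, value in sample) / len(sample)
print(unweighted, weighted_mean(sample, weights))
```

The unweighted mean overstates the population value because wealthy households make up half of the sample but only a tenth of the population. Weighting each observation by the inverse of its selection rate restores the population proportions while keeping the extra observations that reduce variance in the upper tail.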
On the other hand, it is stated that personal wealth tax data is the best indicator of wealth (Bricker et al., 2016). Even if a variable existed that determined the wealth level of individuals in Turkey, there would be no sampling frame with which to match this variable at the individual level. The TURKSTAT sampling frame is based on household addresses rather than individuals. For this reason, it became necessary to find a variable that indicates wealthy households at the level of administrative units such as provinces, districts and neighborhoods, so that it can be matched with the sampling frame provided by TURKSTAT. Administrative data that indicate wealthy households through administrative units have been examined in detail, taking similar country examples into account. It has been decided that the unit house price, which is included in the CBRT – RPPI database from 2010 to 2018 at province, district and neighborhood level, is the most suitable indicator for identifying wealthy households. Housing wealth is often the largest component of household wealth. For this reason, we assume in this paper that wealthy families live in more expensive neighborhoods.

[7] https://www.ecb.europa.eu/pub/economic-research/research-networks/html/researcher_hfcn.en.html

III. Theoretical Background

From a theoretical point of view, we expect to find a positive relationship between unit house prices and the number of home appraisals under the assumption that housing supply is constant (Knoll et al., 2017). However, we expect that the number of home appraisals will increase as unit house prices increase only up to a certain point, and that it will begin to fall when unit house prices exceed a critical point. We think that there will be fewer observations for the wealthiest households for several reasons. The number of houses for sale may be lower in the most expensive neighborhoods. Moreover, the wealthiest households are less likely to apply for a housing loan for a home purchase. Thus, we predict that this relationship will have a concave shape (Figure 2).

Figure 2 – The Relationship between Unit House Prices and the Number of Home Appraisals

House prices and house sales are expected to be higher in residential areas where housing demand is strong (Rosen and Smith, 1983; Riddel, 2004; Steiner, 2010). Previous empirical literature shows that there is a positive and significant relationship between housing demand and household permanent income (Goodman, 1998 and 1990; Zabel, 2004). Moreover, the housing market is an aggregation of many local housing markets, which requires the empirical analysis to be carried out for smaller administrative units (Kiel and Zabel, 2008).
Similarly, previous studies on the Turkish economy find a positive and significant relationship between housing demand and household permanent income (Halicioglu, 2007; Ceritoğlu, 2017 and 2020). In particular, Ceritoğlu (2017) finds that house price changes have a positive and significant effect on the growth of cohort consumption in Turkey. He constructs a pseudo-panel data set using birth-year cohorts from twelve consecutive waves of the HBS between 2003 and 2014. According to his findings, homeowners perceive their housing wealth to be higher as house prices rise, which affects their consumption decisions positively. Thus, his empirical findings support the wealth channel argument in explaining the relationship between house prices and household consumption. Moreover, Ceritoğlu (2020) estimates that the permanent income elasticity of housing demand is approximately 26%, analyzing fourteen consecutive waves of the HBS from 2003 to 2016.

In the case of Turkey, there was excess supply in the housing market across the country and in the three major provinces throughout the period of analysis.[8]

In the equations, unit house prices and the number of home appraisals are denoted by U and A, respectively. The neighborhood or province is indexed by i and the year by t in equations (1) to (4).