### Hierarchical Generalized Linear Mixed Models for Multilevel Analysis of Indonesian Student’s PISA Mathematics Literacy Achievement

- Research Article in Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia
- Authors:
- Tonah Tonah, Anang Kurnia, Kusman Sadik
- Abstract:
Learning assessment and evaluation data in education generally have a hierarchical structure; PISA data are one example. Multilevel models can be used to analyse hierarchical data structures and can be treated as HGLMs. This study has two objectives: to examine the distribution of the mathematical literacy variable, and to select the best HGLM for determining the student-level and school-level variables that significantly influence students' mathematical literacy achievement. The results show that mathematical literacy achievement follows a lognormal distribution and that model M7 is the best model.

### Unordered Features Selection of Low Birth Weight Data in Indonesia using the LASSO and Fused LASSO Techniques

- Research Article in Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia
- Authors:
- Yenni Kurniawati, Khairil Anwar Notodiputro, Bagus Sartono
- Abstract:
This paper analyzes Low Birth Weight (LBW) data on infants in Indonesia using the LASSO and Fused LASSO techniques. Fused LASSO is usually used to select parameters for ordered features, but in this case the features are unordered. This research therefore adopts three techniques for ordering the features and compares the resulting Fused LASSO variants with LASSO. The data cover 1,176 LBW infants from the 2017 Indonesian Demographic and Health Survey (IDHS). The results show that LASSO gives the sparsest solution based on 5-fold cross-validation. The features that contribute to LBW are mothers' occupation, mothers' age, antenatal care, multiple birth, and birth order. However, Fused LASSO 1 has the lowest AIC and BIC values compared with the other ordering techniques. Ordering features by the correlation between the features and the outcome is recommended as an alternative technique for sorting unordered features.
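
For readers unfamiliar with the two penalties, the following is a sketch of the standard textbook formulations, not necessarily the exact objective used in the paper. The generic loss $\ell(\beta)$ (for example a negative log-likelihood), the feature count $p$, and the tuning parameters $\lambda_1, \lambda_2 \ge 0$ are notation assumed here for illustration.

```latex
\hat{\beta}^{\mathrm{LASSO}} = \arg\min_{\beta}\; \ell(\beta) + \lambda_1 \sum_{j=1}^{p} |\beta_j|

\hat{\beta}^{\mathrm{FL}} = \arg\min_{\beta}\; \ell(\beta) + \lambda_1 \sum_{j=1}^{p} |\beta_j|
  + \lambda_2 \sum_{j=2}^{p} |\beta_j - \beta_{j-1}|
```

The second penalty term depends on which coefficients are adjacent, which is why an ordering of the features has to be chosen before Fused LASSO can be applied to unordered features.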

### Hidden Markov Model for Exchange Rate with EWMA Control Chart

- Research Article in Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia
- Authors:
- Rahmawati Ramadhan, Dodi Devianto, Maiyastri Maiyastri
- Abstract:
The US dollar exchange rate still strongly influences the exchange rate stability of many countries, including Indonesia, and it has caused fluctuations in the Rupiah exchange rate. This is one of the cases that can be modeled with a Hidden Markov Model (HMM), an extension of a Markov chain whose states cannot be observed directly (hidden) but only through a set of other observations. In this paper, an Exponentially Weighted Moving Average (EWMA) control chart is used to determine the states of the HMM. Based on the EWMA control chart, there are three states: increase, decrease, and constant. The probability of exchange rate changes in 2019 is predicted with the Baum-Welch algorithm on the HMM. Using 240 observations of the US Dollar to Rupiah exchange rate in 2018, the exchange rate is predicted to increase in 2019 with a probability of 0.57. The HMM results are linked to the EWMA control chart, which shows eight out-of-control observations: two in the increase state and six in the decrease state. Thus, the existence of out-of-control observations implies the probability of an increase in the exchange rate in 2019.
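
As an illustration of how an EWMA control chart can label observations, here is a minimal sketch using the standard EWMA recursion and time-varying control limits. The function name `ewma_chart`, the smoothing constant `lam=0.2`, the limit width `L=3.0`, and the rule that maps out-of-control points to "increase" or "decrease" are all assumptions made for this example; the abstract does not give the paper's actual settings.

```python
import math


def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Return (z_t, lcl_t, ucl_t, state) for each observation.

    mu0 and sigma are the in-control mean and standard deviation.
    lam, L and the state labels are illustrative assumptions.
    """
    z = mu0  # the EWMA statistic starts at the target mean
    rows = []
    for t, x in enumerate(xs, start=1):
        # Standard EWMA recursion: z_t = lam * x_t + (1 - lam) * z_{t-1}
        z = lam * x + (1 - lam) * z
        # Exact (time-varying) half-width of the control limits
        half = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        lcl, ucl = mu0 - half, mu0 + half
        if z > ucl:
            state = "increase"   # assumed label for a point above the UCL
        elif z < lcl:
            state = "decrease"   # assumed label for a point below the LCL
        else:
            state = "constant"   # in control
        rows.append((z, lcl, ucl, state))
    return rows
```

The resulting state sequence could then serve as the observed labels when fitting an HMM, although how the paper couples the two steps is not described in the abstract.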

### Hybrid Model of Seasonal ARIMA-ANN to Forecast Tourist Arrivals through Minangkabau International Airport

- Authors:
- Mutia Yollanda, Dodi Devianto
- Abstract:
Forecasting the number of tourist arrivals is required for the future development of the tourism industry and for improving economic growth. Tourist arrival data can be analyzed by building a model that predicts the number of arrivals through Minangkabau International Airport in the next period. The linear model used is a Seasonal Autoregressive Integrated Moving Average (SARIMA) model, followed by a nonlinear model of the SARIMA residuals built with an Artificial Neural Network (ANN). In this research, the SARIMA model obtained is SARIMA(1, 0, 1)(1, 1, 0)_12. However, the residuals of the SARIMA model do not satisfy the autocorrelation assumption, so a new SARIMA-ANN model is proposed. The residual model is built using an ANN architecture with a 2–2–2–1 network topology. The performance of the time series model, fitted to tourist arrival data from January 2012 to March 2019, is measured using the Mean Absolute Percentage Error (MAPE). The MAPE value of 17.1770% indicates that the model performs well in forecasting the number of tourist arrivals through Minangkabau International Airport.
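
The MAPE figure quoted above is the mean of the absolute errors expressed as a percentage of the actual values. A minimal sketch of that calculation follows; the function name `mape` and the sample numbers in the usage note are hypothetical and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Assumes both sequences have the same length and that no actual
    value is zero (the ratio is undefined in that case).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)
```

For example, `mape([100, 200], [110, 180])` averages two errors of 10% each.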

### Performance Evaluation of AIC and BIC in Time Series Clustering with Piccolo Method

- Authors:
- Triyani Hendrawati, Aji Hamim Wigena, I Made Sumertajaya, Bagus Sartono
- Abstract:
The Piccolo method uses the parameters of an autoregressive model to cluster time series data. One time series can yield several candidate models, but only one model is used for clustering. Akaike's Information Criterion (AIC) or the Bayesian Information Criterion (BIC) can be used for model selection. However, different selection criteria can produce different models, and therefore different clusters. The aim of this research is to evaluate the performance of AIC and BIC in time series clustering with the Piccolo method. A simulation comparing the performance of AIC and BIC in time series clustering using the Piccolo method was carried out. The results show that BIC performs better than AIC.
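
To make the dependence on the selection criterion concrete, the sketch below uses the usual definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, together with a Piccolo-style distance computed as the Euclidean distance between AR coefficient vectors, padding the shorter vector with zeros. The function names, the choice of k as the number of AR coefficients, and the candidate log-likelihoods are hypothetical values chosen for illustration, not results from the paper.

```python
import math
from itertools import zip_longest


def aic(log_lik, k):
    """Akaike's Information Criterion for k estimated parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


def piccolo_distance(ar_a, ar_b):
    """Euclidean distance between AR coefficient vectors (zero-padded)."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip_longest(ar_a, ar_b, fillvalue=0.0))
    )


# Hypothetical candidates for one series: AR order -> maximized log-likelihood.
candidates = {1: -142.0, 2: -140.5}
n_obs = 100  # assumed series length

best_by_aic = min(candidates, key=lambda p: aic(candidates[p], p))
best_by_bic = min(candidates, key=lambda p: bic(candidates[p], p, n_obs))
```

With these made-up numbers AIC selects the AR(2) model and BIC selects the AR(1) model, so the coefficient vector passed to `piccolo_distance`, and hence the clustering, would differ between the two criteria.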

### Prediction of Number of Claims using Poisson Linear Mixed Model with AR(1) random effect

- Authors:
- Fia Fridayanti Adam, Anang Kurnia, I Gusti Putu Purnaba, I Wayan Mangku
- Abstract:
This study focuses on claim count data from an insurance company in Indonesia across 35 locations. The approach taken is a Poisson linear mixed model with two random effects. The response variable is the number of claims, the fixed effect is the deductible, and the random effects are the area and the month of occurrence, the latter assumed to follow a first-order autoregressive process. Fixed and random components are estimated by MPQL and the variance components by REML, with initial values β_0 = 0, β_1 = 0, σ_v^2 = 0.5, and σ_u^2 = 1. The model is fitted on training data (75% of the observations), and predictions are made on testing data (the remaining 25%). Modeling on the training and testing data produces accurate models in almost all regions included in the model. This is indicated by MAPE values of less than 20% in all regions.

### Quasi Poisson Model for Estimating Under-Five Mortality Rate in Small Area

- Authors:
- Nofita Istiana, Anang Kurnia, Azka Ubaidillah
- Abstract:
The Under-Five Mortality Rate (U5MR) is an important indicator because it reflects socio-economic conditions and developments in the health sector. U5MR is obtained from the Demographic and Health Survey (DHS), which is designed for estimation at the national and provincial levels. Decentralization makes U5MR important for sub-provincial domains such as the district/municipality level. Small area estimation (SAE) can be used to estimate U5MR at the district/municipality level using a mixed model; the model often used is the generalized linear mixed model (GLMM). Direct estimation of U5MR produces a large proportion of zero values (excess zeros), so the Poisson model is not suitable for the data: excess zeros violate the equidispersion assumption of the Poisson model. In this study, the quasi-Poisson model produces better predictions than direct estimation. In addition, U5MR estimation at the municipality level makes it possible to produce municipality-level U5MR maps.
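
The equidispersion point can be stated compactly. The lines below give the usual quasi-Poisson variance assumption and the common Pearson-based dispersion estimate; the symbols $\mu_i$, $\phi$, $n$ and $p$ (number of regression parameters) are notation assumed here, and the abstract does not state which estimator the authors used.

```latex
\text{Poisson: } \operatorname{Var}(Y_i) = \mu_i
\qquad
\text{quasi-Poisson: } \operatorname{Var}(Y_i) = \phi\,\mu_i

\hat{\phi} = \frac{1}{n - p} \sum_{i=1}^{n} \frac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}
```

A value of $\hat{\phi}$ well above 1 signals overdispersion, which is the situation in which a plain Poisson model understates the variability of the counts.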

### Estimating the Poverty level in the Coastal Areas of Mukomuko District Using Small Area Estimation: Empirical Best Linear Unbiased Prediction Method

- Authors:
- Etis Sunandi, Dian Agustina, Herlin Fransiska
- Abstract:
This research aims to estimate the poverty level in the coastal areas of Mukomuko District using small area estimation. One small area estimation method is Empirical Best Linear Unbiased Prediction (EBLUP). Using this method, the poverty estimator for the coastal area of Mukomuko District is obtained. The estimators are evaluated by their Mean Square Error (MSE), computed with the bootstrap resampling method. The results show that the MSE of the EBLUP estimator is smaller than the MSE of the direct estimator in every village. This indicates that estimation with the EBLUP method can improve parameter estimation.

### Random Forest Lag Distributed Regression for Forecasting on Palm Oil Production

- Authors:
- Aulia Rizki Firdawanti, I Made Sumertajaya, Bagus Sartono
- Abstract:
Palm oil is one of the most widely cultivated high-potential commodities, so research is needed to identify the determinants of palm oil production and to forecast it. The objectives are to model and forecast palm oil production using random forest lag distributed regression, an analysis that combines the lag distributed regression and random forest methods. The results show a correlation of 0.9302, an RMSE of 20.379, an MAE of 14.143, and an R-squared of 0.829. The five most important variables are quantity of palm oil, land area, palm oil age, the 8th lag of wind velocity, and the 1st lag of temperature. The distribution of the forecasts does not differ much from the distributions of the testing data and the original data.

### Efficiency of Several Complex Survey Design using EBLUP in Small Area Estimation

- Authors:
- Nadra Yudelsa Ratu, Ika Yuni Wulansari
- Abstract:
Survey data are disseminated by estimating parameters from the survey results. Surveys at BPS are becoming more complex, with direct estimation results presented for small areas. However, the sample size for direct estimation in a small area is relatively small, so the estimates are not reliable enough, are inefficient, and have low precision. Other statistics are therefore needed to support the dissemination of total household expenditure data in small areas. This study applies the Small Area Estimation (SAE) method, specifically Empirical Best Linear Unbiased Prediction (EBLUP), with complex survey designs. The sampling methods used are Simple Random Sampling Without Replacement (SRSWOR), One Stage Cluster (SRSWOR), Two Stage Cluster (SRSWOR-SRSWOR), and Two Stage Cluster (PPSWR-SRSWOR), where PPSWR denotes Probability Proportional to Size. The efficiency of the estimates is evaluated using the MSE and RRMSE values obtained under each sampling design. According to the results, the largest MSE and RRMSE values were obtained from the Two Stage Cluster (SRSWOR-SRSWOR) sampling method, while the smallest were obtained from the SRSWOR sampling method, which appears to have a distinct advantage over the other sampling methods.