### Reducing Bitrate and Increasing the Quality of Inter Frame by Avoiding Quantization Errors in Stationary Blocks

- Research Article in EAI Endorsed Transactions on Industrial Networks and Intelligent Systems: Online First
- Authors:
- Xuan-Tu Tran, Ngoc-Sinh Nguyen, Duy-Hieu Bui, Minh-Trien Pham, Hung K. Nguyen, Cong-Kha Pham
- Abstract:
In image compression and video coding, quantization error helps to reduce the amount of information in the high-frequency components. In temporal prediction, however, the quantization error contributes noise to the total residual information. As a result, the residual signal of inter-picture prediction is greater than expected and always differs from zero, even when the input video contains only homogeneous frames. In this paper, we reveal the negative effects of quantization errors in inter prediction and propose a video encoding scheme that avoids these side effects in the stationary parts. We propose implementing a motion detection algorithm as the first stage of video encoding to separate the video into two parts: motion and static. The motion information allows us to force the residual data of the unchanged part to zero while keeping the residual signal of the motion part intact. Besides, we design block-based filters which improve the motion-detection results and fit them to the encoding block size. The fixed residual data of the static part permits us to pre-calculate its quantized coefficients and create a bypass encoding path for it. Experimental results with JPEG compression (MJPEG-DPCM) showed that the proposed method produces a lower bitrate than conventional MJPEG-DPCM at the same quantization parameter, with lower computational complexity.

### Histogram-based Feature Extraction for GPS Trajectory Clustering

- Research Article in EAI Endorsed Transactions on Industrial Networks and Intelligent Systems: Online First
- Authors:
- Chi Nguyen, Thao Dinh, Van-Hau Nguyen, Nhat Phuong Tran, Anh Le
- Abstract:
Clustering trajectories from GPS data is a crucial task for developing applications in intelligent transportation systems. Most existing approaches perform clustering on raw data consisting of series of GPS positions of moving objects over time. Such approaches are not suitable for classifying the moving behaviours of vehicles, e.g., distinguishing the trajectory of a taxi from that of a private car. In this paper, we focus on the problem of clustering trajectories of vehicles that share the same moving behaviours. Our approach uses histogram-based feature extraction to model the moving behaviours of objects and applies traditional clustering algorithms to group the trajectories. We perform experiments on real datasets and obtain better results than existing approaches.
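
The idea of histogram-based features can be sketched as follows: each trajectory is reduced to a fixed-length vector describing its speed distribution, which any standard clustering algorithm can then consume. This is an illustrative sketch, not the paper's implementation; the bin edges and the `(time, x, y)` track format are our assumptions.

```python
import math

def speed_histogram(track, bins=(0, 5, 10, 20, 40, float("inf"))):
    """Summarise a GPS trajectory as a normalised histogram of
    point-to-point speeds (m/s). `track` is a list of (t_seconds, x_m, y_m)
    tuples; the bin edges here are illustrative, not the paper's."""
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt > 0:
            speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    counts = [0] * (len(bins) - 1)
    for s in speeds:
        for i in range(len(bins) - 1):
            if bins[i] <= s < bins[i + 1]:
                counts[i] += 1
                break
    total = sum(counts) or 1
    return [c / total for c in counts]

# Two straight-line tracks with different speed profiles map to different
# feature vectors, which k-means (for example) can then separate even
# though the raw GPS shapes are identical.
slow = [(i, i * 2.0, 0.0) for i in range(10)]   # ~2 m/s
fast = [(i, i * 15.0, 0.0) for i in range(10)]  # ~15 m/s
```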

### Implementing Extreme Gradient Boosting (XGBoost) Classifier to Improve Customer Churn Prediction

- Research Article in Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia
- Authors:
- Iqbal Hanif
- Abstract:
As a part of Customer Relationship Management (CRM), churn prediction is very important for identifying customers who are most likely to churn so that they can be retained with caring programs. Among machine learning algorithms, Extreme Gradient Boosting (XGBoost) has recently become popular in many machine learning challenges as an ensemble method expected to give better predictions on imbalanced-class data, a common characteristic of customer churn data. This research aims to test whether the XGBoost algorithm gives better predictions than the existing logistic regression algorithm. It was conducted using a sample of customer data (both churned and retained customers) and their behaviours recorded over six months, from October 2017 to March 2018. There were four phases: data preparation, feature selection, modelling, and evaluation. The results show that XGBoost gives better predictions than logistic regression based on prediction accuracy, specificity, sensitivity and the ROC curve. The XGBoost model also separates churned from non-churned customers better than the logistic regression model, according to the KS chart and Gains-Lift charts produced by each algorithm.
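
The evaluation criteria named above (accuracy, sensitivity, specificity) all derive from the confusion matrix; on imbalanced churn data, sensitivity and specificity matter more than raw accuracy. A minimal sketch of those criteria, with churn coded as the positive class (our convention, not necessarily the paper's):

```python
def churn_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from binary labels,
    with churn = 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),        # recall on churners
        "specificity": tn / (tn + fp),        # recall on stayers
        "accuracy": (tp + tn) / len(y_true),
    }

m = churn_metrics([1, 1, 0, 0, 0, 1], [1, 0, 0, 0, 1, 1])
```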

### Quasi Poisson Model for Estimating Under-Five Mortality Rate in Small Area

- Research Article in Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia
- Authors:
- Nofita Istiana, Anang Kurnia, Azka Ubaidillah
- Abstract:
The Under-Five Mortality Rate (U5MR) is an important indicator because it reflects socio-economic conditions and developments in the health sector. U5MR is obtained from the Demographic and Health Survey (DHS), whose estimates are designed for the national and provincial levels. The decentralization system makes U5MR important for sub-domains of a province, such as the district/municipality level. Small area estimation (SAE) can be used to estimate U5MR at the district/municipality level using a mixed model; the model most often used is the generalized linear mixed model (GLMM). Direct estimation of U5MR produces a large proportion of zero values (excess zeros), so the Poisson model is not suitable for modeling the data: the excess zeros violate the equidispersion assumption of the Poisson model. In this study, the quasi-Poisson model produces better predictions than direct estimation. In addition, estimating U5MR at the municipality level makes it possible to produce municipality-level U5MR maps.
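
The equidispersion violation motivating the quasi-Poisson model can be checked with the Pearson dispersion statistic; values well above 1 signal overdispersion, and the quasi-Poisson model absorbs it as Var(Y) = φμ. A sketch with made-up counts, not the paper's data:

```python
def dispersion(counts, fitted):
    """Pearson dispersion estimate phi = chi^2 / df for a Poisson-type fit.
    phi >> 1 indicates overdispersion (e.g. from excess zeros)."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    df = len(counts) - 1  # assuming one estimated mean parameter
    return chi2 / df

# Excess-zero data fitted with a constant mean: dispersion far above 1.
counts = [0, 0, 0, 0, 8, 2]
mu = sum(counts) / len(counts)
phi = dispersion(counts, [mu] * len(counts))
```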

### Hybrid Model of Seasonal ARIMA-ANN to Forecast Tourist Arrivals through Minangkabau International Airport

- Research Article in Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia
- Authors:
- Mutia Yollanda, Dodi Devianto
- Abstract:
Forecasting the number of tourist arrivals is required for the future development of the tourism industry and for improving economic growth. Tourist arrival data can be analyzed by building a model that helps predict the number of arrivals through Minangkabau International Airport in future periods. The linear model used is the Seasonal Autoregressive Integrated Moving Average (SARIMA), and a nonlinear model of the SARIMA residuals is then built using an Artificial Neural Network (ANN). In this research, the SARIMA model obtained is SARIMA(1, 0, 1)(1, 1, 0)12. However, the residuals of this model do not satisfy the no-autocorrelation assumption, so a hybrid SARIMA-ANN model is proposed, in which the residuals are modeled by an ANN with a 2–2–2–1 network topology. The performance of the time series model, on data from January 2012 to March 2019, is measured using the Mean Absolute Percentage Error (MAPE). The MAPE value of 17.1770% indicates that the model performs well in forecasting the number of tourist arrivals through Minangkabau International Airport.
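
The MAPE figure reported above is computed as the mean of absolute percentage errors; a minimal implementation (excluding zero actuals to avoid division by zero, which is our assumption, not the paper's stated rule):

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)

m = mape([100, 200, 400], [90, 210, 380])  # errors 10%, 5%, 5%
```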

### Hidden Markov Model for Exchange Rate with EWMA Control Chart

- Authors:
- Rahmawati Ramadhan, Dodi Devianto, Maiyastri Maiyastri
- Abstract:
Nowadays, the US Dollar exchange rate is still very influential on the exchange rate stability of many countries, including Indonesia, and it has caused fluctuations in the Rupiah exchange rate. This is one of the cases that can be modeled with the Hidden Markov Model (HMM), a development of the Markov chain in which the state cannot be observed directly (it is hidden) but only through a set of other observations. In this paper, an Exponentially Weighted Moving Average (EWMA) control chart is used to determine the states of the HMM. Based on the EWMA control chart, there are three states: increase, decrease, and constant. The probability of exchange rate changes in 2019 is predicted with the Baum-Welch algorithm on the HMM. Using 240 observations of the US Dollar to Rupiah exchange rate in 2018, the exchange rate in 2019 is predicted to increase with a probability of 0.57. The HMM results are connected back to the EWMA control chart, which shows eight out-of-control observations: two in the increase state and six in the decrease state. Thus, the existence of out-of-control data supports the predicted probability of an increasing exchange rate in 2019.
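
The EWMA chart used to label the HMM states tracks the statistic z_t = λx_t + (1−λ)z_{t−1} against control limits that widen with t. A minimal sketch with textbook defaults (λ = 0.2, L = 3, target and sigma estimated from the data itself); the paper's exact parameter choices are not stated here:

```python
def ewma_chart(x, lam=0.2, L=3.0):
    """EWMA control chart: returns the EWMA statistic z_t and an
    out-of-control flag per point, using time-varying limits."""
    n = len(x)
    mu = sum(x) / n
    sigma = (sum((v - mu) ** 2 for v in x) / (n - 1)) ** 0.5
    z, zs, flags = mu, [], []
    for t, v in enumerate(x, start=1):
        z = lam * v + (1 - lam) * z
        # limit half-width; converges to the asymptotic value as t grows
        w = L * sigma * (lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))) ** 0.5
        zs.append(z)
        flags.append(abs(z - mu) > w)
    return zs, flags

# A stable series stays inside the limits everywhere.
zs, flags = ewma_chart([10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9, 10.2, 9.8])
```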

### Simulation Study to Describe Bayesian Analysis of Nonlinear Structural Equation Modeling

- Authors:
- Ferra Yanuar, Aidinil Zetra
- Abstract:
Structural equation modeling (SEM) has been widely used in many disciplines, such as economics, politics and health. Nonlinear structural equation modeling, as part of SEM, has also been developed analytically, but the literature is still limited. In this method, the model parameters are estimated using conjugate priors in a Bayesian approach. In nonlinear SEM, the models include quadratic forms and/or interactions of latent variables. The posterior means and posterior variances of the parameters are estimated using an iterative approach, since estimating them analytically is difficult. The iterative approach used here is the Markov Chain Monte Carlo (MCMC) method with Gibbs sampling. A simulation study illustrates the proposed estimation method for the nonlinear model: a group of 300 data points is generated to demonstrate its implementation. The study shows that the proposed nonlinear SEM model can be accepted based on goodness-of-fit criteria.

### Performance Evaluation of AIC and BIC in Time Series Clustering with Piccolo Method

- Authors:
- Triyani Hendrawati, Aji Hamim Wigena, I Made Sumertajaya, Bagus Sartono
- Abstract:
The Piccolo method uses the parameters of an autoregressive (AR) model to cluster time series data. One time series can yield several candidate models, but only one is used for clustering; Akaike's Information Criterion (AIC) or the Bayesian Information Criterion (BIC) can be used to select it. However, different criteria can select different models, which in turn can produce different clusters. The aim of this research is to evaluate the performance of AIC and BIC in time series clustering with the Piccolo method. A simulation comparing AIC with BIC in Piccolo-based clustering was carried out, and the results show that BIC performs better than AIC.
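
The two criteria differ only in their complexity penalty: 2k for AIC versus k·log(n) for BIC. A sketch of the standard Gaussian-likelihood forms (the paper may use a variant), showing how they can disagree on the AR order for the same fits:

```python
import math

def aic_bic(n, sse, k):
    """AIC and BIC for a model with k parameters and residual sum of
    squares `sse` on n observations (Gaussian likelihood, up to constants)."""
    ll_term = n * math.log(sse / n)
    return ll_term + 2 * k, ll_term + k * math.log(n)

# For n = 200, log(n) > 2, so BIC penalises the extra lags harder:
# AIC prefers the AR(4) fit, BIC the AR(2) fit -- which is how the
# choice of criterion can send the same series to different clusters.
aic2, bic2 = aic_bic(200, 100.0, 2)
aic4, bic4 = aic_bic(200, 95.0, 4)
```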

### Hierarchical Generalized Linear Mixed Models for Multilevel Analysis of Indonesian Student’s PISA Mathematics Literacy Achievement

- Authors:
- Tonah Tonah, Anang Kurnia, Kusman Sadik
- Abstract:
Generally, learning assessment and evaluation data in education have hierarchical structures, of which PISA data is one example. Multilevel models can be used to analyse hierarchical data structures and can be considered HGLM models. This study has two objectives: to examine the distribution of the mathematical literacy variable, and to select the best HGLM model for determining the student- and school-level variables that significantly influence students' mathematical literacy achievement. The results obtained are that mathematical literacy achievement follows a lognormal distribution and that the M7 model is the best model.

### The Use of MEWMA Control Chart in Controlling Major Components of Cement Product

- Authors:
- Surya Puspita Sari, Dodi Devianto
- Abstract:
Cement is one of the industrial products that undergoes a quality control process. Its major components, SiO2, Al2O3, Fe2O3, CaO, MgO and SO3, are the basic constituents of the product. This research explains quality control of the major components of cement using the MEWMA control chart. The performance of the control chart is measured with the average run length (ARL), the average number of observations until the first out-of-control point; the parameter ARL0 is the average run length when the process is in control. In this research, the data are assumed to be in control. ARL0 is optimized through the weight parameter of the MEWMA control chart; for λ = 1 the chart is equivalent to the Hotelling T2 chart. The optimal value of the weight parameter is determined using the bisection method, after which the variables showed no outlier data. Finally, this research shows that the cement production process is in control.
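
The bisection method mentioned above is a generic bracketing root-finder; the paper applies it to tune the MEWMA weight parameter, but the routine itself is standard. Shown here on a simple function rather than the ARL equation:

```python
def bisection(f, lo, hi, tol=1e-8):
    """Find a root of f on [lo, hi] by repeated interval halving.
    Requires f(lo) and f(hi) to have opposite signs."""
    assert f(lo) * f(hi) < 0, "root must be bracketed"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

root = bisection(lambda x: x * x - 2, 0.0, 2.0)  # converges to sqrt(2)
```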

### A New Mixture Distribution for Extreme Excess Zeros: Negative Binomial-Generalized Exponential (NB-GE) Distribution

- Authors:
- Junifsa Afly Prameswari, Ida Fithriani, Siti Nurrohmah
- Abstract:
The Negative Binomial-Generalized Exponential (NB-GE) distribution is capable of modeling overdispersed data with extreme excess zeros, i.e., more than 80% zeros in the data. It is a mixture distribution obtained by mixing the Negative Binomial (NB) distribution with the Generalized Exponential (GE) distribution. This paper discusses the construction of the NB-GE distribution and its characteristics, such as the probability mass function, kth moment, mean, variance, skewness and kurtosis. The parameters of the NB-GE distribution are estimated using the maximum likelihood method. As an illustration, the NB-GE distribution is used to model fatal crash data containing more than 80% zeros.

### Unordered Feature Selection of Low Birth Weight Data in Indonesia Using the LASSO and Fused LASSO Techniques

- Authors:
- Yenni Kurniawati, Khairil Anwar Notodiputro, Bagus Sartono
- Abstract:
This paper analyzes Low Birth Weight (LBW) data on infants in Indonesia using the LASSO and Fused LASSO techniques. Fused LASSO is usually used to select parameters for ordered features; in this case the features are unordered, so this research adopts three techniques for ordering them. All of these Fused LASSO techniques are then compared with LASSO. The paper uses data on 1,176 LBW infants collected from the 2017 Indonesian Demographic and Health Survey (IDHS). The results show that LASSO has the sparsest solution based on 5-fold cross-validation. The features that contribute to LBW are the mother's occupation, mother's age, antenatal care, multiple birth, and birth order. However, Fused LASSO 1 has the lowest AIC and BIC values compared with the other ordering techniques. Ordering features by the correlation between the features and the outcome is recommended as an alternative technique for sorting unordered features.
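
The sparsity that both LASSO and Fused LASSO produce comes from soft-thresholding: coefficients (or, in Fused LASSO, differences between neighbouring ordered coefficients) are shrunk toward zero and set exactly to zero below the penalty level. A minimal sketch of the operator itself:

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam): the building block of
    LASSO-type updates. Returns 0 exactly when |z| <= lam, which is
    what makes the fitted solution sparse."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0
```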

### Using LDA for Innovation Topic of Technology: Quantum Dots Patent Analysis

- Authors:
- Nurmitra Sari Purba, Rani Nooraeni
- Abstract:
This study explores information about quantum dots (QDs), a nanotechnology, through analysis of patent information. QDs patent documents were obtained from the United States patent database, the USPTO, using web scraping. In total, 3,914 patents from 1988 to 2016 were collected and archived for analysis. This paper discusses how to apply Latent Dirichlet Allocation (LDA), a topic model, in a trend analysis methodology that exploits patent information. After text preprocessing and transformation, the number of topics is decided using the log-likelihood value. The LDA model is then used to identify underlying topic structures based on latent relationships among the extracted technological terms. We extracted words from six relevant topics and showed that these topics are highly meaningful in explaining technology applications of QDs.

### Prediction of Number of Claims Using Poisson Linear Mixed Model with AR(1) Random Effect

- Authors:
- Fia Fridayanti Adam, Anang Kurnia, I Gusti Putu Purnaba, I Wayan Mangku
- Abstract:
This study focuses on claim count data from an insurance company in Indonesia across 35 locations. The approach taken is a Poisson linear mixed model with two random effects. The response variable is the number of claims, the fixed effect is the deductible, and the random effects are the area and the month of occurrence, the latter assumed to follow a first-order autoregressive (AR(1)) process. The fixed and random components are estimated with MPQL, while the variance components are estimated with REML using the initial values β_0 = 0, β_1 = 0, σ_v^2 = 0.5, and σ_u^2 = 1. Modeling is carried out on training data comprising 75% of the observations, and predictions are made on the remaining 25% as testing data. Modeling on the training and testing data produces accurate models in almost all regions included in the model, as indicated by MAPE values below 20% in all regions.

### Determination of General Circulation Model Domain Using LASSO to Improve Rainfall Prediction Accuracy in West Java

- Authors:
- Nanda Fadhli, Aji Hamim Wigena, Anik Djuraidah
- Abstract:
The statistical downscaling technique has often been used to predict rainfall. This technique needs a domain of general circulation model (GCM) data, and the selection of the GCM domain is an important factor in improving prediction accuracy. The goal of this study is to determine the optimum domain. This study uses GCM data from CFSRv2 with a grid resolution of 2.5° × 2.5° and local rainfall data in West Java. The GCM domain is determined based on a minimum correlation of 0.3 between GCM data and local rainfall data. Correlations are calculated for grids in the four compass directions, with the grid directly above the local rainfall station as the reference. The domains are evaluated using a regression model with L1 (LASSO) regularization. The results show that the optimum domain is 8 × 5 grids.
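
The screening step above keeps a GCM grid cell in the domain when its correlation with the local rainfall series reaches the 0.3 cutoff. A plain-Python sketch of that Pearson correlation test (the paper works on gridded monthly series; the toy data here are ours):

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Keep the grid cell when |r| meets the 0.3 threshold.
keep = abs(pearson_r([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])) >= 0.3
```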

### Bayesian LASSO Quantile Regression: An Application to the Modeling of Low Birth Weight

- Authors:
- Ferra Yanuar, Aidinil Zetra, Arrival Rince Putri, Yudiantri Asdi
- Abstract:
Modeling low birth weight with ordinary least squares is inappropriate and inefficient: the data violate the normality assumption since they are right-skewed, and they usually contain outliers as well. Many researchers have used quantile regression for this case, but the approach has a limitation: it needs a moderate to large sample size. This study combines quantile regression with the Bayesian LASSO approach to model low birth weight. The Bayesian method can handle small sample sizes since it combines information from the data (the likelihood function) with prior information about the parameters to be estimated (the prior distribution). This study demonstrates that both Bayesian quantile regression and Bayesian LASSO (Least Absolute Shrinkage and Selection Operator) quantile regression yield acceptable models of low birth weight based on goodness-of-fit indicators. Bayesian LASSO quantile regression produced better parameter estimates, since it yielded shorter 95% Bayesian credible intervals than Bayesian quantile regression.
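
Quantile regression at level τ minimises the check (pinball) loss, and the Bayesian version arises because minimising that loss is equivalent to maximising an asymmetric-Laplace likelihood. A one-function sketch of the loss:

```python
def check_loss(residual, tau):
    """Quantile check loss rho_tau(u) = u * (tau - I(u < 0)).
    tau = 0.5 recovers least-absolute-deviations; tau near 1 penalises
    under-prediction of the response much more than over-prediction."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))
```

For τ = 0.9, a residual of +2 costs 1.8 while a residual of −2 costs only 0.2, which is what pulls the fitted line up toward the 90th percentile.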

### Mean Square Error of Non-Sampled Area in Small Area Estimation

- Authors:
- Faisal Haris, Azka Ubaidillah
- Abstract:
Small area estimation (SAE) is a statistical technique for estimating parameters of subpopulations with small or even zero sample sizes. An area with a zero sample size can be estimated with the support of cluster information: the area random effect is assumed to be similar across regions and can be analyzed by clustering the auxiliary variables. In SAE, the mean square error (MSE) is used to compare the precision of parameter estimates, but no study has yet discussed the MSE of non-sampled areas. The main idea of this research is to modify the existing statistical method by adding cluster information, under the assumption that similar areas share similar characteristics. The new method was evaluated with a data simulation and a case study. The simulation shows that all the modified methods produce relatively similar MSEs for the non-sampled areas.

### Confidence Interval for Multivariate Process Capability Indices in Statistical Inventory Control

- Authors:
- Mustafid Mustafid, Dwi Ispriyanti, Sugito Sugito, Diah Safitri
- Abstract:
Multivariate process capability indices (MPCI) play an important role in statistical inventory control determined by several correlated consumer demands treated as quality characteristics. Inventory control management also needs confidence intervals for MPCI to cope with the uncertainty of consumer demand. This research aims to apply confidence intervals for MPCI in statistical inventory control. Case studies were conducted in the apparel industry, implementing confidence intervals for MPCI with several types of apparel used as the quality characteristics. The upper and lower limits of the MPCI intervals are obtained from sample data under the assumption of a stable multivariate normal process. Process sample data in stable conditions are obtained using a multivariate control chart designed with Hotelling's T2. The MPCI confidence interval can be used as an indicator for determining the number of products kept in inventory based on consumer demand.

### Comparison of Maximum Likelihood and Generalized Method of Moments in Spatial Autoregressive Model with Heteroskedasticity

- Authors:
- Rohimatul Anwar, Anik Djuraidah, Aji Hamim Wigena
- Abstract:
Spatial dependence and spatial heteroskedasticity are problems in spatial regression. Spatial autoregressive regression (SAR) concerns only the dependence on the spatial lag. Estimating the parameters of a SAR model containing heteroskedasticity with the maximum likelihood estimation (MLE) method gives biased and inconsistent estimates. An alternative is the generalized method of moments (GMM), which uses a combination of linear and quadratic moment functions simultaneously, making computation easier than MLE. Bias is used to evaluate GMM in estimating the parameters of a SAR model with heteroskedastic disturbances on simulated data. The results show that GMM yields parameter estimates whose bias is relatively consistent and smaller than that of the MLE method.

### Bayes Risk Post-Pruning in Decision Tree to Overcome Overfitting Problem on Customer Churn Classification

- Authors:
- Devina Christianti, Sarini Abdullah, Siti Nurrohmah
- Abstract:
Classification is the process of assigning data to existing classes. The decision tree is claimed to be faster and to produce better accuracy than other classifiers; however, it is susceptible to overfitting. This problem can be avoided by post-pruning, which trims subtrees with little influence on the classification to improve the model's predictive performance. This paper proposes a post-pruning method based on Bayes risk, in which the risk estimate of each parent node is compared with that of its leaves. The method is applied to two customer churn classification datasets, from the Kaggle site and from IBM, with three training set sizes (60%, 70%, and 80%). The results show that Bayes risk post-pruning can improve decision tree performance, and that larger training sets were associated with higher accuracy, precision, and recall.

### Variable Selection in Analyzing Live Infant Births in Indonesia Using Group LASSO and Group SCAD

- Authors:
- Ita Wulandari, Khairil Anwar Notodiputro, Bagus Sartono
- Abstract:
Regression analysis often requires selecting among the explanatory variables X1, X2, ..., Xp, so coefficient shrinkage can facilitate the interpretation of the resulting regression equation. In this context, the explanatory variables often have a grouping structure, so the more relevant problem is how to choose groups rather than individual variables. Group LASSO and group SCAD are techniques for selecting groups of variables which, in much of the literature, appear to have advantages over LASSO. In this study, the percentage of live-born children in the provinces of Bali and East Nusa Tenggara and in the other Indonesian provinces was analyzed and linked to explanatory variables using the group LASSO and group SCAD methods. The available explanatory variables were grouped based on theory and the results of previous studies. The results show that the best model is obtained with the group SCAD method, which has the smallest AIC, BIC and GCV values. The factors included in the model for Bali province are demographics, women's status and autonomy, and the economy. For East Nusa Tenggara province the factors are demographics and economics, while for Indonesia in general they are demographics, women's status and autonomy, and family planning.

### Bayesian Quantile Regression Modeling to Estimate Extreme Rainfall in Indramayu

- Authors:
- Eko Primadi Hendri, Aji Hamim Wigena, Anik Djuraidah
- Abstract:
Quantile regression can be used to analyze symmetric or asymmetric data. Estimates of the quantile regression parameters are obtained by the simplex method. Another approach is the Bayesian method, based on the asymmetric Laplace distribution, using MCMC; MCMC is used to numerically estimate the parameters from each posterior distribution. Both Bayesian quantile regression and quantile regression can be used for statistical downscaling in extreme-rainfall cases. This study used statistical downscaling to obtain the relationship between global-scale and local-scale data. The data used were monthly rainfall data in Indramayu and GCM output data. LASSO regularization was used to overcome multicollinearity problems in the GCM output data. The purpose of this study was to compare Bayesian quantile regression models with quantile regression. Both methods could predict extreme rainfall accurately and consistently one year ahead, with the Bayesian quantile regression model performing relatively better than quantile regression.
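Quantile regression estimates the tau-th conditional quantile by minimizing the asymmetric "check" (pinball) loss, which is also the negative log-likelihood kernel of the asymmetric Laplace distribution used in the Bayesian formulation. A small illustrative sketch (hypothetical values, not the study's data):

```python
import numpy as np

def check_loss(residuals, tau):
    """Quantile-regression check (pinball) loss rho_tau(u):
    tau * u for u >= 0, (tau - 1) * u for u < 0. Minimizing its sum
    over residuals y - X @ beta yields the tau-th conditional quantile."""
    u = np.asarray(residuals, dtype=float)
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

# For an upper quantile (tau = 0.95, relevant to extreme rainfall),
# under-prediction is penalized far more heavily than over-prediction.
loss_under_pred = check_loss([2.0], tau=0.95)[0]  # residual +2
loss_over_pred = check_loss([-2.0], tau=0.95)[0]  # residual -2
```

This asymmetry is what lets a high-tau fit track extreme rainfall rather than the conditional mean.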

### Estimating the Poverty level in the Coastal Areas of Mukomuko District Using Small Area Estimation: Empirical Best Linear Unbiased Prediction Method

- Authors:
- Etis Sunandi, Dian Agustina, Herlin Fransiska
- Abstract:
This research aims to estimate the poverty level in the coastal areas of Mukomuko District using small area estimation. One of the estimation methods in small area estimation is Empirical Best Linear Unbiased Prediction (EBLUP). Using this method, a poverty estimator for the coastal area of Mukomuko District is obtained. The parameter estimator is evaluated by its MSE (Mean Square Error), computed with a bootstrap resampling method. The results show that the MSE of the EBLUP estimator is smaller than the MSE of the direct estimator in each village, which indicates that estimation with the EBLUP method improves the parameter estimates.
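An area-level EBLUP of the Fay-Herriot type is a reliability-weighted composite of the direct survey estimate and a regression-synthetic estimate. A minimal sketch under that assumption, with hypothetical village-level figures (not the study's data):

```python
import numpy as np

def fay_herriot_eblup(direct, synthetic, sigma2_v, D):
    """Fay-Herriot style EBLUP for an area-level model:
    gamma_i * direct_i + (1 - gamma_i) * synthetic_i, where
    gamma_i = sigma2_v / (sigma2_v + D_i) weights the direct estimate
    by its reliability (D_i is the area's sampling variance)."""
    gamma = sigma2_v / (sigma2_v + np.asarray(D, dtype=float))
    return gamma * np.asarray(direct) + (1.0 - gamma) * np.asarray(synthetic)

# Hypothetical village poverty rates: the area with the noisier direct
# estimate (larger D) borrows more strength from the synthetic estimate.
direct = np.array([0.30, 0.10])
synthetic = np.array([0.20, 0.20])
eblup = fay_herriot_eblup(direct, synthetic, sigma2_v=0.01, D=[0.01, 0.04])
```

The shrinkage toward the synthetic estimate is what drives the EBLUP's smaller MSE relative to the direct estimator in small areas.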

### Construction of ANFIS Model Based on LM-Test for Forecasting of Chili Price Data in Semarang

- Authors:
- Tarno Tarno, Di Asih I Maruddani, Rita Rahmawati
- Abstract:
The aim of this research is to construct an Adaptive Neuro-Fuzzy Inference System (ANFIS) model for forecasting time series data. The ANFIS model is constructed and applied to chili price data in Semarang. The daily data were recorded from December 2018 to May 2019. Input selection for ANFIS is done using the Lagrange Multiplier (LM) test; lag-1 with 2 membership functions is selected as the optimal input. The in-sample prediction performance is measured by the mean absolute percentage error (MAPE) and root mean squared error (RMSE), whose values are 2.9% and 939.8, respectively.
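The two reported accuracy measures are standard and easy to state explicitly. A short sketch with hypothetical chili prices (not the study's series or its reported scores):

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs((a - p) / a))

def rmse(actual, predicted):
    """Root mean squared error, in the data's own units."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return np.sqrt(np.mean((a - p) ** 2))

# Hypothetical daily chili prices (Rp/kg) and one-step forecasts.
actual = [30000.0, 32000.0, 31000.0]
forecast = [29400.0, 32640.0, 30690.0]
```

MAPE is scale-free (useful for comparing across series), while RMSE, like the reported 939.8, stays in the price units and so reflects the series' scale.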

### Efficiency of Several Complex Survey Design using EBLUP in Small Area Estimation

- Authors:
- Nadra Yudelsa Ratu, Ika Yuni Wulansari
- Abstract:
The dissemination of data from a survey is carried out by estimating parameters from the survey results. Surveys at BPS are now becoming more complex, with direct estimation results presented for small areas. However, the sample size for direct estimation in a small area is relatively small, so the direct estimates are not reliable or efficient enough and have low precision. Therefore, other statistics are needed that can accommodate the dissemination of total household expenditure data in small areas. This study applies the Small Area Estimation (SAE) method Empirical Best Linear Unbiased Prediction (EBLUP) under a complex survey design. The sampling methods used in the complex survey designs are Simple Random Sampling Without Replacement (SRSWOR), One-Stage Cluster (SRSWOR), Two-Stage Cluster (SRSWOR-SRSWOR), and Two-Stage Cluster (Probability Proportional to Size with Replacement, PPSWR-SRSWOR). The efficiency of the estimation results is evaluated based on the MSE and RRMSE values obtained for each complex survey design. According to the results, the largest MSE and RRMSE values were obtained from the Two-Stage Cluster (SRSWOR-SRSWOR) sampling method, while the smallest MSE and RRMSE values were obtained from the SRSWOR sampling method, which thus has a distinct advantage over the other sampling methods.
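The MSE/RRMSE comparison across designs can be computed from replicate estimates. A minimal sketch of one plausible bootstrap-style calculation (hypothetical replicate values, not the study's figures or its exact resampling scheme):

```python
import numpy as np

def bootstrap_mse_rrmse(estimate, replicates):
    """MSE of an estimator taken as the mean squared deviation of
    replicate estimates from the point estimate, and the relative
    root MSE, RRMSE = 100 * sqrt(MSE) / estimate, used to compare
    the efficiency of different survey designs on a common scale."""
    reps = np.asarray(replicates, dtype=float)
    mse = np.mean((reps - estimate) ** 2)
    rrmse = 100.0 * np.sqrt(mse) / estimate
    return mse, rrmse

# Hypothetical replicates of a mean-expenditure estimate under one design;
# a more dispersed replicate set would yield larger MSE and RRMSE.
mse, rrmse = bootstrap_mse_rrmse(100.0, [98.0, 102.0, 101.0, 99.0])
```

Because RRMSE is expressed as a percentage of the estimate, it allows the four sampling designs to be ranked even when their point estimates differ.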