Machine Learning for Flood Prediction in Indonesia: Providing Online Access for Disaster Management Control
Econ. Environ. Geol. 2023 Feb;56(1):65-73
Published online February 28, 2023;
Copyright © 2023 The Korean Society of Economic and Environmental Geology.

Reta L. Puspasari1, Daeung Yoon2, Hyun Kim3, Kyoung-Woong Kim1,*

1School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology (GIST), Korea
2Department of Energy and Resources Engineering, Chonnam National University, Korea
3Division of Environmental Health Sciences, School of Public Health, University of Minnesota, United States
Received February 6, 2023; Revised February 22, 2023; Accepted February 23, 2023.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
As Indonesia is one of the countries most vulnerable to floods, it has a growing need for accurate and reliable flood forecasting. A new prediction model using a machine learning algorithm is therefore proposed to provide daily flood prediction in Indonesia. Data crawling was conducted to obtain daily rainfall, streamflow, land cover, and flood data from 2008 to 2021. The model was built using a Random Forest (RF) classification algorithm that predicts future floods from three days of rainfall rate, forest ratio, and streamflow. The accuracy, specificity, precision, recall, and F1-score on the test dataset are approximately 94.93%, 68.24%, 94.34%, 99.97%, and 97.08%, respectively. Moreover, the AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve is 71%. The objective of this research is to provide a model that accurately predicts flood events in Indonesian regions three months before the day of the flood. As a trial, the model was applied to June 2022 and predicted the flood events accurately. The prediction results are then published on a website as a warning system that supports flood mitigation.
Keywords : machine learning algorithm, random forest, online access website, flood mitigation, F-score
Research Highlights
  • A prediction model for floods in Indonesia is proposed using a machine learning algorithm.

  • A Random Forest classification algorithm is used to build the model, with three days of rainfall rate, forest ratio, and streamflow as input.

  • The model shows accuracy, specificity, precision, recall, and F1-score of approximately 94.93%, 68.24%, 94.34%, 99.97%, and 97.08%, respectively, and an AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve of 71%.

  • The result of this research is a model that predicts future floods in Indonesian regions three months before the day of the disaster. The prediction is then published on a website as a warning system that supports flood mitigation.

1. Introduction

Climate change has altered the hydrological cycle, leading to extreme rainfall events and a high risk of flooding (Tabari, 2020). This impact occurs around the world, including in Indonesia. According to Chapman et al. (2021), as of 2019 Indonesia ranked 17th out of 191 countries most at risk from high exposure to flooding. Floods have always been one of the most significant water issues in Indonesia, causing great loss of life and property damage. In 2019, a total of 4,756,929 people were affected by flooding, including those killed, missing, and injured. Moreover, the total property damage and loss reached $5.5 billion (Chapman et al., 2021).

To anticipate future flood events, the BMKG (Badan Meteorologi, Klimatologi, dan Geofisika, the Meteorology, Climatology, and Geophysical Agency of Indonesia), the Directorate General of Water Resources, and the Ministry of Public Works of Indonesia collaborate to generate monthly flood prediction maps, which combine monthly rainfall forecasts with flood-prone area maps, as a public warning. However, the absence of mathematical and statistical methods in this local prediction has resulted in hundreds of wrong predictions, including false positives and false negatives. A false positive, where a flood is predicted but does not occur, creates and spreads fear and distrust. A false negative, where a flood is not predicted but does occur, is more dangerous and costly. It leaves communities unprepared, which can lead to loss of lives, damage to property, waterborne diseases, loss of livestock, and destruction of crops.

Given the limitations of Indonesia's current prediction capacity, advanced statistical and mathematical methods such as Artificial Intelligence (AI) can improve prediction accuracy significantly. Machine Learning (ML) is a branch of AI designed to imitate human intelligence by learning from data (Leibe et al., 2015). ML algorithms have a wide range of applications in many disciplines, including hydrology, where they were first applied in the 1990s and have been employed extensively since. The increasing amount of data stored on the internet, together with growing computing power and better ML algorithms for analyzing the data, is driving changes in almost every aspect of our lives. Flood risk and impact assessments are also influenced by this trend, particularly in areas such as the development of mitigation measures, emergency response preparation, and flood recovery planning (Lamovec et al., 2013; Bonafilia, 2020; Wagenaar et al., 2020).

Rainfall intensity, streamflow, and catchment characteristics such as land cover are the main factors in flooding (Arnell and Gosling, 2016; Asadieh and Krakauer, 2017; Hirabayashi et al., 2013). This study integrates these features with flood data to predict future flood events in Indonesia. It then provides a model that generates publicly accessible daily flood predictions.

Specifically, this study aims to provide a high-performance flood prediction model using an ML algorithm. The use of ML is proposed to minimize erroneous predictions. Moreover, this study introduces a freely accessible website. The prediction model is expected to support the local prediction by providing more extensive information. The current flood warning system operates on a monthly basis, whereas this model provides daily flood predictions up to three months ahead. These daily predictions will be publicly accessible to raise awareness and to inform local government decisions on flood mitigation and management.

2. Materials and Methods

2.1 Study Area

Indonesia is one of the most populous countries in the world and experiences frequent flood events. Floods in Indonesia are caused by many factors, such as climate, catchment characteristics, topography, socioeconomic conditions, and poor water management systems. This study covers all 33 regions of Indonesia. Each region has one or several climatological stations from which the data were gathered.

2.2. Data Sources

A total of 534,536 daily rainfall time-series records were gathered to build the model. The data cover 14 years, from 2008 to 2021, and were collected from 165 weather stations registered on the BMKG website, accessible at https://dataonline. Besides rainfall, catchment characteristics also contribute significantly to flooding. Due to the limited availability of data, land cover and streamflow are the two parameters chosen for inclusion in the model. These data were obtained from BPS (Badan Pusat Statistik, the Central Bureau of Statistics of Indonesia), accessible at

Flood events in Indonesia are recorded by BNPB (Badan Nasional Penanggulangan Bencana, translated to National Disaster Management Agency of Indonesia), accessible at The data collection to build the model started with the oldest record of flood data available, which was in 2008.

Table 1. Example of data

No | Region | Station | Date | Rainfall | Forest Ratio | Streamflow | Flood
1 | Aceh | Malikus Saleh | 2008-01-01 | 0 | 0.59 | 9.97 | No Flood
2 | Bali | Pos Kahang | 2008-01-01 | 13 | 0.22 | 11.6 | No Flood
3 | Bangka Belitung | AHS | 2010-01-01 | 25 | 0.39 | 38.8 | Flood
5 | Bengkulu | Bengkulu | 2008-01-01 | 0 | 0.46 | 69.5 | No Flood

2.3. Local Prediction Evaluation

As mentioned in Section 1, the local prediction is generated through the collaboration of BMKG, the Directorate General of Water Resources, and the Ministry of Public Works. The monthly rainfall forecast and flood-prone area maps are used as the prediction input. This local prediction consists of “Tinggi” (“High”), “Menengah” (“Moderate”), and “Rendah” (“Low”) labels depending on the monthly rainfall forecast.

A total of 3,042 monthly predictions, covering a three-year period for each region in Indonesia, were gathered for evaluation. The comparison between past local predictions and actual flood events revealed hundreds of incorrect predictions. In some cases a “High”, “Moderate”, or “Low” risk was predicted but no flood occurred (false positives); in others the prediction was “-” or “No Flood” but a flood occurred (false negatives). The summary of the comparison is presented in Table 2 below.

Table 2 . The summary of local predictions


2.4. Random Forests (RF)

Decision Tree (DT) is an ML method widely applied in flood prediction models. A DT maps observations of the input features, represented by branches, to target values, represented by leaves. The Random Forests (RF) method is one of the most popular DT-based methods for flood prediction (Wang et al., 2015).

The RF classification algorithm consists of many decision trees. It uses bagging and feature randomness when building each tree to create an uncorrelated forest, which is typically more accurate than a single decision tree. The final decision is made by majority voting among the decision trees. According to Mosavi et al. (2015), the RF method has proven to be as efficient and robust as more popular methods such as Artificial Neural Networks (ANN).

The RF algorithm for classification relies on bagging (Bootstrap Aggregating). The bootstrap, also called resampling, is a technique for improving the quality of estimators. Resampling means that samples are drawn, with replacement, from the empirical distribution of the data.
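The bootstrap-and-vote idea can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the base learner is a hypothetical one-feature threshold rule rather than a full decision tree, the data are made up, and the per-split feature randomness of RF is omitted.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (one bootstrap replicate)."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(sample):
    """Toy base learner (assumption): predict flood when rainfall exceeds the
    midpoint between the mean rainfall of flood and no-flood days."""
    flood = [rain for rain, label in sample if label == 1]
    dry = [rain for rain, label in sample if label == 0]
    if not flood or not dry:
        # Degenerate replicate with a single class: always predict that class.
        constant = 1 if flood else 0
        return lambda rain: constant
    cut = (sum(flood) / len(flood) + sum(dry) / len(dry)) / 2
    return lambda rain: 1 if rain > cut else 0

def bagged_predict(models, rain):
    """Majority vote over all base learners."""
    votes = Counter(model(rain) for model in models)
    return votes.most_common(1)[0][0]

# Made-up (rainfall, flood label) pairs; 1 = flood, 0 = no flood.
data = [(0, 0), (2, 0), (5, 0), (8, 0), (30, 1), (45, 1), (60, 1), (80, 1)]

rng = random.Random(0)
# An odd number of learners avoids tied votes in the binary case.
models = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]

print(bagged_predict(models, 90), bagged_predict(models, 1))
```

Each learner sees a different resample of the same data, so individual thresholds vary, while the vote over all learners is more stable than any single one.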

Proposed by Breiman (1996), the bagging algorithm is an ensemble method used to decrease the variance of an estimator, which improves the robustness of a forecast. It relies on the Weak Law of Large Numbers (WLLN), which states that the sample average converges in probability toward the expected value.

Sample mean:

$$M_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

The sample mean is an unbiased estimator of $m_X$ if $E[X_i] = m_X$ for $i = 1, \ldots, n$, that is, $E[M_n] = m_X$.


Proven by Chebyshev inequality,
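A standard form of this argument is sketched below. It assumes that the $X_i$ are independent with a common finite variance $\sigma^2$, an assumption that is not stated in the text above, so that $\operatorname{Var}(M_n) = \sigma^2 / n$:

```latex
P\left( \left| M_n - m_X \right| \ge \varepsilon \right)
  \le \frac{\operatorname{Var}(M_n)}{\varepsilon^{2}}
  = \frac{\sigma^{2}}{n\,\varepsilon^{2}}
  \longrightarrow 0
  \quad \text{as } n \to \infty, \text{ for every } \varepsilon > 0 .
```

In bagging, the average over trees plays the role of $M_n$. Trees grown on overlapping bootstrap samples are not fully independent, so the variance reduction achieved in practice is smaller than the factor $1/n$ suggests.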


2.5. Model Label and Input

Before training the model, the dataset was binarized with the flood events as the label: 1 for flood and 0 for no flood. The model considered three days of each feature to predict future floods. The inputs were rainfall rate, forest ratio, and streamflow. The details of the model input are presented in Table 3.

Table 3. Input and description

Input | Description
R-0 | today’s rain (D-0)
R-1 | yesterday’s rain (D-1)
R-2 | rain 2 days ago (D-2)
Forest_Ratio-0 | today’s forest ratio (D-0)
Forest_Ratio-1 | yesterday’s forest ratio (D-1)
Forest_Ratio-2 | forest ratio 2 days ago (D-2)
Streamflow-0 | today’s river flow (D-0)
Streamflow-1 | yesterday’s river flow (D-1)
Streamflow-2 | river flow 2 days ago (D-2)
Flood | tomorrow’s flood (flood next day / D+1)
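One way such lagged inputs could be assembled from aligned daily series is sketched below in Python. The function name `build_samples` and the five-day record are hypothetical; only the column names follow Table 3, and the authors' own pre-processing in R may differ.

```python
def build_samples(rain, forest, flow, flood, n_lags=3):
    """Turn aligned daily series into (features, label) pairs.

    For each day t the features are the D-0, D-1 and D-2 values of every
    variable, and the label is the flood flag of the following day (D+1).
    """
    samples = []
    # Start where D-2 exists and stop one day early so that D+1 exists.
    for t in range(n_lags - 1, len(rain) - 1):
        features = {}
        for name, series in (("R", rain),
                             ("Forest_Ratio", forest),
                             ("Streamflow", flow)):
            for lag in range(n_lags):
                features[f"{name}-{lag}"] = series[t - lag]
        samples.append((features, flood[t + 1]))
    return samples

# Hypothetical five-day record for one station (values are made up).
rain = [0, 13, 25, 4, 0]
forest = [0.59, 0.59, 0.59, 0.59, 0.59]
flow = [9.9, 10.4, 12.1, 11.0, 10.2]
flood = [0, 0, 0, 1, 0]  # 1 = flood, 0 = no flood

samples = build_samples(rain, forest, flow, flood)
for features, label in samples:
    print(features["R-0"], features["R-1"], features["R-2"], "->", label)
```

With five days of data only two complete samples exist, because the first two days lack a full lag history and the last day has no D+1 label.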

2.6. Model Development

The procedure of this study includes data gathering, data pre-processing, model development, model assessment, and deployment of the website. Data pre-processing comprises eliminating duplicates and outliers, imputing missing values, normalizing the data, and visualizing the data. In this phase, the data are also randomly split into two sets: a training set and a testing set. The training set is used by the model to learn and the testing set is used to examine the model's performance. The ratio of training to testing data is 8:2.
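A random 8:2 split can be sketched in a few lines of Python. The helper name `train_test_split` and the fixed seed are assumptions made for reproducibility of the example; the study's actual split was performed in R.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and cut it into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))  # stand-in for 100 labelled samples
train, test = train_test_split(rows)
print(len(train), len(test))
```

Every sample ends up in exactly one of the two sets, so the model is evaluated only on data it did not see during training.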

The model is developed using RF methods and evaluated using model assessment methods. After confirming the generation of a highly accurate model, the flood prediction produced by the model will be deployed to a website that local governments can freely access for consideration on flood control.

Analyses and model development were conducted in the R programming language using the RStudio platform. R is widely used for data mining, bioinformatics, and statistical research, including the development of statistical software. In this study, R version 4.1.1 was run on a MacBook Pro (Retina, 13-inch) with macOS Big Sur ver. 11.6.5, a 2.6 GHz Dual-Core Intel Core i5 processor, and 16 GB of 1600 MHz DDR3 memory. The workflow of this study is presented in Fig. 2.

Figure 1. Indonesia map with 33 regions.
Figure 2. Flowchart of model development.

2.7. Model Assessments

2.7.1. Confusion Matrix

A confusion matrix was proposed by Kohavi and Provost in 1998 as a performance measurement. It contains information on the actual and predicted values of a classification algorithm. The model's performance is commonly evaluated using the information in the matrix. The matrix for a 2-class classification consists of:

• True Positive (TP): the model predicted a positive value, and the prediction is correct.

• False Positive (FP) / Type I Error: the model predicted a positive value, but the actual value is negative.

• False Negative (FN) / Type II Error: the model predicted a negative value, but the actual value is positive.

• True Negative (TN): the model predicted a negative value, and the prediction is correct.
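The four counts and the metrics derived from them can be computed as in the following Python sketch. The labels are made up for illustration (1 = flood, 0 = no flood), and the function names are not taken from the study.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN and TN for binary labels (1 = positive)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1
        elif p == 0 and a == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def metrics(tp, fp, fn, tn):
    """Standard metrics; assumes every denominator is non-zero."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),
    }

# Made-up example labels.
actual = [1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
predicted = [1, 1, 1, 1, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(actual, predicted)
result = metrics(tp, fp, fn, tn)
print(tp, fp, fn, tn, result)
```

In this toy case recall is much higher than specificity would be on a larger negative class, which shows why a single accuracy figure can hide how the model treats each class.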

2.7.2. F measure (F1-score)

The F measure is widely used to evaluate binary classification models. It summarizes the model's precision and recall on a dataset in a single value, and a perfect F measure is 1. In this study, the F measure is an essential measurement of model performance because, in disaster prediction, both false positives and false negatives are critical. The F measure is defined as the harmonic mean of Precision and Recall (Sasaki, 2007). It uses the harmonic mean (H) in place of the arithmetic mean (A) because the harmonic mean penalizes extreme values more heavily.
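The difference between the harmonic and arithmetic means can be checked numerically with a short Python sketch. The first pair of inputs uses the precision and recall reported in this paper; the second pair is a made-up extreme case.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def arithmetic_mean(precision, recall):
    return (precision + recall) / 2

# Precision and recall reported for the test dataset.
reported = f1_score(0.9434, 0.9997)

# Made-up extreme case: perfect precision but very low recall.
extreme_h = f1_score(1.0, 0.1)
extreme_a = arithmetic_mean(1.0, 0.1)

print(round(reported, 4), round(extreme_h, 4), round(extreme_a, 4))
```

With the reported precision and recall the harmonic mean is about 0.9707, which agrees with the reported F1-score of 97.08% up to rounding of the inputs. In the extreme case the harmonic mean is about 0.18 while the arithmetic mean is 0.55, which is the penalizing behavior described above.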

2.7.3. Receiver Operating Characteristic (ROC) Curve and Area Under the Curve (AUC)

The Receiver Operating Characteristic (ROC) graph is used to assess a classifier. It plots the false positive rate on the x-axis and the true positive rate on the y-axis; that is, the true positive rate (sensitivity) is plotted as a function of the false positive rate (1 - specificity). Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. A perfect model is represented by the point (0,1), where all positive and negative cases are classified correctly. The area under the ROC curve (AUC) measures how well the model distinguishes between the two classes (here, flood and no flood). The ROC curve is also a fundamental tool for diagnostic test evaluation.
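The threshold sweep and the area computation can be sketched in plain Python as follows. The scores and labels are made up, and the trapezoidal rule is one common way to obtain the AUC; it is not claimed to be the routine used in the study.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Samples with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(points):
    """Area under the piecewise-linear curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up predicted flood probabilities and true labels (1 = flood).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(points, area)
```

The resulting area equals the fraction of (flood, no-flood) pairs in which the flood case receives the higher score, which is 8 of 9 pairs in this toy example.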

2.8. Variable Importance and Partial Dependence Plot

The variable importance measures the impact of each predictor variable (feature) in predicting the dependent variable (Strobl et al., 2008), whereas the partial dependence plot (PDP) shows the dependence between the dependent variable and a (set of) feature(s) (Pedregosa et al., 2011).
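The idea behind a partial dependence curve can be sketched in Python: one feature is forced to each value on a grid while all other features keep their observed values, and the predictions are averaged. The function `toy_predict` below is a hypothetical stand-in for a fitted model, and its coefficients are invented for illustration.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is set to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the input rows stay unchanged
            modified[feature] = value
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve

def toy_predict(row):
    """Hypothetical model: flood probability rises with rain and falls with forest."""
    return min(1.0, 0.01 * row["R-0"] + 0.2 * (1 - row["Forest_Ratio-0"]))

# Made-up observations using the feature names of Table 3.
rows = [
    {"R-0": 0, "Forest_Ratio-0": 0.5},
    {"R-0": 10, "Forest_Ratio-0": 0.3},
]

curve = partial_dependence(toy_predict, rows, "R-0", [0, 20, 40])
print(curve)
```

A rising curve for a rainfall feature, as in this toy case, is the kind of pattern that would indicate a strong dependence of the predicted flood probability on rainfall.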

2.9. Flood Prediction of June 2022 and R-Shiny Website

A series of daily rainfall forecasts was collected three (3) months prior to the actual events. These rainfall data were input to the prediction model to obtain flood predictions for June 2022 for all 33 regions in Indonesia.

Shiny is a user-friendly web application framework for the R programming language that allows users to create and deploy a publicly accessible website. The website is designed to present the flood predictions produced by the RF model.

2.10. Disaster Management Control

Disaster management is a comprehensive approach to preventing, preparing for, responding to, and aiding emergency recovery efforts. There are four main phases: mitigation, preparedness, response, and recovery. This study will contribute to the mitigation phase by providing information for early warning.

Early Warning (EW) is “the provision of timely and effective information, through identified institutions, that allows individuals exposed to hazard to take action to avoid or reduce their risk and prepare for effective response” (ISDR UN, 2006). EW consists of 4 key factors: risk knowledge, monitoring and warning service, dissemination and communication, and response capability.

3. Results and Discussions

3.1. Model Performance

The model was built using the RF algorithm for classification and saved in .rds format. The model grows 500 trees with an mtry value of 3. Tree complexity is indicated by the number of nodes per tree, presented in Fig. 3. As shown in Fig. 3, the most frequent tree size is close to 45,000 nodes, which is representative of the majority of the trees in the forest. Training the model took 3 hours and 42 minutes, whereas inference for the June 2022 flood prediction took 4 minutes. Evaluation using a confusion matrix indicated that the model performs well, as presented in Table 4 below.

Table 4. Model performance

Metric | Value (%)
Accuracy | 94.93
Specificity | 68.24
Precision | 94.34
Recall | 99.97
F1-score | 97.08

Figure 3. The number of nodes for the trees.

To further evaluate the model, the ROC curve is plotted in Fig. 4. The AUC of this ROC curve is 0.71, or 71%. As discussed in Section 2.7.3, an AUC close to 1 (100%) indicates a good model. This result is an acceptable value for the prediction model.

Figure 4. The ROC curves.

3.2. Variable Importance and PDP

The PDPs indicate that flood in Indonesia highly depends on rainfall rate value. Furthermore, the variable importance plot shows the most to the least important variables: rainfall rate, forest ratio, and streamflow. The PDP graphs of each feature and Variable of Importance are presented in Fig. 5 and Fig. 6, respectively.

Figure 5. Partial dependence plot on RR (rainfall rate), Streamflow and Forest Ratio.

Figure 6. Variable of Importance.

Fig. 6 shows that the most important feature for flood occurrence was RR-0, the rainfall rate on the current day, which means that the flood occurs on the same day as the heavy rain. The variable importance therefore suggests that the majority of flood events in Indonesia are flash floods. In this study, both the PDP and the variable importance indicate that rainfall is the most important feature influencing flood occurrence.

3.3. Future Flood Prediction

The model was used to generate daily flood predictions for June 2022. The default output of the RF algorithm is binary: 1 for flood and 0 for no flood. However, the result can also be presented as the probability of flood occurrence. For example, when the model predicts a 50% probability of flood, half of the 500 trees voted for a flood event: 250 trees voted for 1 and the other 250 voted for 0.

In this study, the result is presented as a probability in percentage (%) format. This decision considers several factors:

a) People are more familiar with meteorological predictions expressed as probabilities (numbers in %) than as binary outcomes (0 or 1, yes or no).

b) The level of preparedness or readiness. The decision threshold of the RF algorithm is 0.5 and the number of trees is 500. This means that if 251 trees vote for 0 (no flood), the final decision is 0, and the 249 trees that voted for 1 (flood), a 49.8% vote rate, are ignored. In probability format, the user can see the chance of a flood event.

c) The prediction results are freely accessible worldwide. Considering point a), the prediction should be easy for users to understand without any help with interpretation.
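The relation between tree votes, the binary decision, and the reported percentage can be illustrated with a short Python sketch. The vote list reproduces the 249-versus-251 case discussed above; the function names are hypothetical, and the handling of an exact tie (treated here as no flood) is an assumption.

```python
def flood_probability(votes):
    """Share of trees voting for flood, as a percentage."""
    return 100.0 * sum(votes) / len(votes)

def binary_decision(votes, threshold=0.5):
    """Return 1 (flood) only when the vote share exceeds the threshold."""
    return 1 if sum(votes) / len(votes) > threshold else 0

# 249 of 500 trees vote for flood (1), 251 vote for no flood (0).
votes = [1] * 249 + [0] * 251

probability = flood_probability(votes)
decision = binary_decision(votes)
print(probability, decision)
```

The binary output is 0 even though almost half of the trees predict a flood, whereas the percentage keeps that information visible to the user.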

3.4. Online Access Website

A website was developed and deployed to better inform the result of flood prediction. This prediction is in daily format and could be generated three months before the predicted event. The website is available at Quick access to the flood warning website is provided by QR code in Fig. 7.

Figure 7. QR code for quick access to flood warning Indonesia website.
4. Conclusions

ML algorithms have become an essential part of prediction models due to their powerful decision-making ability. This study supports the literature cited in Section 2, which states that RF is a sufficient and efficient algorithm for building a high-performance model. The model shows high performance in accuracy, precision, recall (sensitivity), and F1-score, with a specificity of 68.24% and an AUC of 0.71.

This study provides daily flood prediction instead of monthly to support the current one (local prediction). The daily prediction is presented in probability, similar to a rain forecast, and deployed to this website:

Another finding of this study is the indication of the type of flood in Indonesia. The variable importance and PDP curves suggest that most flood events are flash flooding. This is revealed by the most important variable, which is the rainfall rate, supported by the graph of PDP in Fig. 5.

Automation is highly recommended for future work. It would reduce human workload by automating repetitive processes such as data crawling, cleaning, predicting, and deploying. However, AI should augment human work, not replace it. Supervision of the algorithm is still needed, including updating the model as new data become available.

Another recommendation for future work is to add more data and input features to the model. Eventually, more data will be provided online. In this study, the prediction was made on a provincial scale. When more detailed information becomes accessible, the model should be improved so that the prediction covers a more precise area. Beyond finer spatial resolution, more features should also be added to the model, such as slope, physical properties of soil, and details of land coverage.


Acknowledgments

We would like to acknowledge BMKG, BPS, and BNPB for making climate, catchment characteristics, and flood data publicly available. Moreover, we would like to acknowledge the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A2C1094272) and the GIST Research Institute (GRI) grant funded by the GIST in 2022.

  1. Arnell, N.W. and Gosling, S.N. (2016) The impacts of climate change on river flood risk at the global scale. Clim. Change, v.134(3), p.387-401. doi: 10.1007/s10584-014-1084-5
  2. Asadieh, B. and Krakauer, N.Y. (2017) Global change in streamflow extremes under climate change over the 21st century. Hydrol. Earth Syst. Sci., v.21(11), p.5863-5874. doi: 10.5194/hess-21-5863-2017
  3. Bonafilia, D., Tellman, B., Anderson, T. and Issenberg, E. (2020) Sen1Floods11: A georeferenced dataset to train and test deep learning flood algorithms for sentinel-1. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2020-June, p.835-845. doi: 10.1109/CVPRW50498.2020.00113
  4. Breiman, L. (1996) Bagging predictors. Mach Learn, v.24, p.123-140. doi: 10.1007/BF00058655
  5. Chapman, A., Davies, W., Downey, C. and Dookie, D. (2021) ADB Climate Risk Country Profile: Micronesia. Think-Asia
  6. Hirabayashi, Y., Mahendran, R., Koirala, S., Konoshima, L., Yamazaki, D., Watanabe, S., ... and Kanae, S. (2013) Global flood risk under climate change. Nat. Clim. Change, v.3(9), p.816-821. doi: 10.1038/nclimate1911
  7. International Strategy for Disaster Reduction (ISDR) (2006) What's early warning. Accessible at:
  8. Lamovec, P., Veljanovski, T., Mikoš, M. and Oštir, K. (2013) Detecting flooded areas with machine learning techniques: case study of the Selška Sora River flash flood in September 2007. J. Appl. Remote Sens., v.7(1), p.073564. doi: 10.1117/1.jrs.7.073564
  9. Leibe, B., Matas, J., Sebe, N. and Welling, M. (2015) What is Machine Learning?. Lecture Notes in Computer Science. Springer, Cham., v.9905. doi: 10.1007/978-3-319-46448-0_2
  10. Martinez-Taboada, F. and Redondo, J.I. (2020) Variable importance plot (mean decrease accuracy and mean decrease Gini). PLOS ONE. doi: 10.1371/journal.pone.0230799.g002
  11. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. and Duchesnay, E. (2011) Scikit-learn: Machine learning in Python. J. Mach Learn Res., v.12, p.2825-2830.
  12. Sasaki, Y. (2007) The truth of the F-measure. Teach Tutor Mater, v.26.
  13. Strobl, C., Boulesteix, A. L., Kneib, T., Augustin, T. and Zeileis, A. (2008) Conditional variable importance for random forests. BMC Bioinform., v.9, p.1-11. doi:10.1186/1471-2105-9-307
  14. Tabari, H. (2020) Climate change impact on flood and extreme precipitation increases with water availability. Sci. Rep., v.10(1), p.1-10. doi: 10.1038/s41598-020-70816-2
  15. Wagenaar, D., Curran, A., Balbi, M., Bhardwaj, A., Soden, R., Hartato, E., Mestav Sarica, G., Ruangpan, L., Molinario, G. and Lallemant, D. (2020) Invited perspectives: How machine learning will change flood risk and impact assessment. Nat. Hazards Earth Syst. Sci., v.20(4), p.1149-1161. doi: 10.5194/nhess-20-1149-2020
  16. Wang, Z., Lai, C., Chen, X., Yang, B., Zhao, S. and Bai, X. (2015) Flood hazard risk assessment model based on random forest. J. Hydrol., v.527, p.1130-1141. doi: 10.1016/j.jhydrol.2015.06.008
  17. Yu, P.-S., Yang, T.-C., Chen, S.-Y., Kuo, C.-M. and Tseng, H.-W. (2017) Comparison of random forests and support vector machine for real-time radar-derived rainfall forecasting. J. Hydrol., v.552, p.92-104. doi: 10.1016/j.jhydrol.2017.06.020

