Special Research Paper on “Applications of Data Science and Artificial Intelligence in Economic and Environmental Geology”


Econ. Environ. Geol. 2024; 57(5): 473-486

Published online October 29, 2024

https://doi.org/10.9719/EEG.2024.57.5.473

© THE KOREAN SOCIETY OF ECONOMIC AND ENVIRONMENTAL GEOLOGY

Application of Deep Learning and Optical Character Recognition Technology to Automate Classification and Database of Borehole Log for Ground Stability Investigation of Abandoned Mines

Hosang Han¹, Jangwon Suh²,*

¹Department of Energy and Mineral Resources Engineering, Kangwon National University, Samcheok, Korea
²Department of Energy Resources and Chemical Engineering, Kangwon National University, Samcheok, Korea

Correspondence to: *jangwonsuh@kangwon.ac.kr

Received: September 6, 2024; Revised: October 1, 2024; Accepted: October 3, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided original work is properly cited.

Abstract

Boring logs record geomaterial and subsurface structure information and are essential for evaluating ground stability in abandoned mine areas. However, because they come in a variety of formats and are managed in analog form, extracting useful information from them is time-consuming and prone to human error. This study therefore develops an algorithm to efficiently manage and analyze boring log data, provided in PDF format, from abandoned mine ground investigations. The EfficientNet deep learning model was used to classify the boring logs into five types, achieving a classification accuracy of 1.00. Optical character recognition (OCR) and PDF text extraction techniques were then applied to extract text data from each type of boring log. OCR frequently misrecognized the text of the boring logs, whereas PDF text extraction recovered it with very high accuracy. A database schema was then established, and the extracted text was reorganized according to that schema and written as structured data in spreadsheet form. The results suggest an effective approach to managing boring logs as part of the transition to digital mining, and the structured data derived from legacy boring logs are expected to be readily usable for machine learning analysis.

Keywords: boring log, deep learning, optical character recognition, database construction, smart mining

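Korean title (translated): Application of Deep Learning and Optical Character Recognition Technology for the Classification and Database Construction of Boring Logs for Ground Stability Investigation of Abandoned Mines

Hosang Han¹ · Jangwon Suh²,*

¹Ph.D. student, Department of Energy and Mineral Resources Engineering, Kangwon National University
²Associate Professor, Department of Energy Resources and Chemical Engineering, Kangwon National University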

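Abstract (translated from Korean)

Boring logs represent geological media and subsurface structure information and are an important resource that is essential for evaluating ground stability in abandoned mine areas. However, because boring logs come in various forms and are managed in analog form, deriving useful information from them has the drawback of introducing human error or consuming time and cost. This study therefore developed an algorithm that can efficiently manage and analyze boring log data for abandoned mine ground investigation provided in PDF file format. To this end, the EfficientNet deep learning model was used to classify boring logs into five types, and the classification accuracy was very high at 1.00. Text was then extracted from each classified type of boring log using optical character recognition (OCR) technology and a PDF text extraction technique. The OCR technology frequently misrecognized the text data of the boring logs, whereas the PDF text extraction technique extracted text with very high accuracy. The structure of the database was then established, and the text data of the boring logs were reorganized according to the designed structure and written as structured data in spreadsheet form. The results of this study present an effective way to manage boring logs in the transition to digital mining, and the boring log data structured from legacy data are expected to be readily usable for machine learning analysis.

Keywords: boring log, deep learning, optical character recognition, database construction, smart mining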


Figure 1. Flowchart of the research procedures used in this study.

Figure 2. Components of a boring log used to assess ground stability.

Figure 3. Five examples of reports by type of boring log to assess ground stability.

Figure 4. Structure of the EfficientNet-B3 deep learning model.

Figure 5. Confusion matrix for evaluating the accuracy and reliability of classification results.

Figure 6. Confusion matrix of boring logs classified using the EfficientNet-B3 deep learning model.

Figure 7. Variation in training loss with the number of epochs.

Figure 8. The result of extracting text from a single page of a boring log PDF file and saving it to a single CSV file.

Figure 9. Example of boring log data implemented based on the designed DB schema.

Figure 10. The result of the OCR engine recognizing and highlighting the text on a boring log page. (left) Original image; (right) highlighted image.
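The confusion matrices in Figures 5 and 6 are the basis for the reported classification accuracy of 1.00. As a minimal sketch of how such a matrix is read, the following computes overall accuracy and per-class precision and recall. The counts are hypothetical placeholders, not the values in Figure 6, and the row/column convention (rows are true classes, columns are predicted classes) is an assumption.

```python
from typing import List, Tuple


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: diagonal (correct) counts divided by all counts."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def precision_recall(cm: List[List[int]]) -> List[Tuple[float, float]]:
    """Per-class (precision, recall); rows = true class, columns = predicted class."""
    n = len(cm)
    results = []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_as_k = sum(cm[i][k] for i in range(n))  # column sum
        actually_k = sum(cm[k])                           # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        results.append((precision, recall))
    return results


# Hypothetical five-class matrix with no off-diagonal entries (illustrative counts only).
perfect_cm = [
    [35, 0, 0, 0, 0],
    [0, 109, 0, 0, 0],
    [0, 0, 17, 0, 0],
    [0, 0, 0, 4, 0],
    [0, 0, 0, 0, 17],
]

# Hypothetical two-class matrix with some errors, for contrast.
imperfect_cm = [
    [8, 2],
    [1, 9],
]

print(accuracy(perfect_cm))
print(precision_recall(imperfect_cm))
```

An accuracy of 1.00 corresponds to a matrix whose off-diagonal entries are all zero, in which case every per-class precision and recall is also 1.0.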

Table 1. Number of abandoned mine ground stability investigation reports and boring logs used in this study.

| Item | Unit | Number |
|---|---|---|
| Ground stability investigation report | ea | 47 |
| | pages | 2961 |
| Boring logs (in reports) | ea | 403 |
| | pages | 908 |

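Table 2. Number of abandoned mine ground stability investigation reports and boring logs by type.

| Class | Reports (number) | Reports (%) | Borehole log in report (pages) | Borehole log in report (%) |
|---|---|---|---|---|
| Type 1 | 16 | 34.0 | 175 | 19.7 |
| Type 2 | 23 | 48.9 | 545 | 61.4 |
| Type 3 | 4 | 8.5 | 85 | 7.2 |
| Type 4 | 3 | 6.4 | 19 | 2.1 |
| Type 5 | 1 | 2.1 | 84 | 9.5 |
| Sum | 47 | 100.0 | 908 | 100.0 |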

Table 3. Comparison of the advantages and disadvantages of the CNN family of models.

| Model | Advantages | Disadvantages | Reason for selection |
|---|---|---|---|
| VGGNet¹ | Simple and uniform architecture; good feature extraction | Large number of parameters; high computational cost; memory intensive | Not selected due to high computational requirements and potential overfitting |
| Inception² | Efficient use of computational resources; reduced number of parameters | Complex architecture; difficult to modify | Not selected due to complexity and less flexibility for our specific task |
| DenseNet³ | Feature reuse; reduced number of parameters; strong gradient flow | Memory intensive during training; computationally expensive for very deep networks | Considered but not selected due to memory constraints in our setup |
| ResNet⁴ | Solves vanishing gradient problem; can be very deep; good performance on various tasks | Still relatively large number of parameters; training very deep versions can be time-consuming | Strong contender, but not selected due to EfficientNet’s better efficiency |
| EfficientNet⁴ | Scalable to different computational budgets | Relatively new, less extensively tested; may require careful tuning of compound scaling | Selected due to its balance of efficiency and performance, and its adaptability to our computational resources |

1. Simonyan and Zisserman, 2014.
2. Szegedy et al., 2015.
3. Huang et al., 2017.
4. Tan and Le, 2019.


Table 4. Hyperparameter settings for the EfficientNet-B3 deep learning model.

| Hyperparameter | Value |
|---|---|
| Input Size | 300 × 300 |
| Batch Size | 32 |
| Number of Classes | 5 |
| Optimizer | Adam |
| Learning Rate | 0.001 |
| Loss Function | CrossEntropyLoss |
| Number of Epochs | 20 |
| Transformations | Resize, ToTensor, Normalize |
| Normalization Mean | [0.485, 0.456, 0.406] |
| Normalization Standard Deviation | [0.229, 0.224, 0.225] |
| Hardware | CUDA* (if available) / CPU |

* CUDA: Compute Unified Device Architecture.
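To make the preprocessing settings in Table 4 concrete, the sketch below collects the listed hyperparameters in a small configuration object and applies the channel-wise normalization, (value - mean) / std, to a single RGB pixel. The mean and standard deviation are the widely used ImageNet statistics, and the transformation names match those in torchvision, but the class name `TrainConfig` and the helper `normalize_pixel` are hypothetical names introduced here for illustration. This plain-Python version is not the authors' training code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from Table 4; the field names are illustrative."""
    input_size: Tuple[int, int] = (300, 300)
    batch_size: int = 32
    num_classes: int = 5
    optimizer: str = "Adam"
    learning_rate: float = 0.001
    loss_function: str = "CrossEntropyLoss"
    num_epochs: int = 20
    norm_mean: Tuple[float, float, float] = (0.485, 0.456, 0.406)
    norm_std: Tuple[float, float, float] = (0.229, 0.224, 0.225)


def normalize_pixel(rgb: Tuple[float, float, float],
                    cfg: TrainConfig) -> Tuple[float, ...]:
    """Channel-wise (value - mean) / std for one RGB pixel already scaled to [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, cfg.norm_mean, cfg.norm_std))


cfg = TrainConfig()

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel((0.485, 0.456, 0.406), cfg))

# A pure white pixel, common on scanned log pages, maps to positive values.
print(normalize_pixel((1.0, 1.0, 1.0), cfg))
```

Because boring log pages are mostly white background with dark text and lines, most normalized inputs sit near the upper end of each channel's range; the normalization simply rescales them to the statistics the pretrained weights expect.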


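Table 5. Design of DB schema fields (header and body information). Field names are given in Korean as in the schema, with English glosses in parentheses.

| Header information (field name) | Body information (field name) |
|---|---|
| 사업명 (project name) | 지층명 (stratum name) |
| 시추공번 (borehole number) | 심도(m) (depth) |
| 조사일 (survey date) | 표고(m) (elevation) |
| 위치(X) (location X) | 두께(m) (thickness) |
| 위치(Y) (location Y) | 설명 (description) |
| 지반표고(m) (ground elevation) | TCR(%) |
| 굴진심도(m) (drilling depth) | RQD(%) |
| 시추방법 (drilling method) | D |
| 지하수위(m) (groundwater level) | S |
| 케이싱심도(m) (casing depth) | F |
| 시추기 (drilling machine) | 절리간격_최대(cm) (joint spacing, maximum) |
| 시추공경 (borehole diameter) | 절리간격_최소(cm) (joint spacing, minimum) |
| | 절리간격_평균(cm) (joint spacing, mean) |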
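Table 6. Brief comparison of misrecognized examples from different OCR techniques for header information fields.

| Original text | CLOVA OCR | Result | EasyOCR | Result |
|---|---|---|---|---|
| 사업명 (project name) | 사업영 | X | 사 업 영 | X |
| 시추공번 (borehole number) | 시추공번 | ○ | 시추공번 | ○ |
| 조사일 (survey date) | 조사 일 | X | 조 사 입 | X |
| 위치 (location) | 위 치 | X | 뭐 지 | X |
| 표고 (elevation) | 표 고 | ○ | 포, 고 | X |
| 굴진심도 (drilling depth) | 굴진심도 | ○ | 굽진심도 | X |
| 시추방법 (drilling method) | 시추방법 | ○ | N구방번 | X |
| 지하수위 (groundwater level) | 지하수위 | ○ | 지하수위 | ○ |
| 케이싱심도 (casing depth) | 케이싱싱도 | X | 캐이싱심도 | X |
| 시추기 (drilling machine) | 시추기 | ○ | 시 수 기 | X |
| 시추공경 (borehole diameter) | 시추공경 | ○ | 시즌공럽 | X |

X: Misrecognized, ○: Correct.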


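Table 7. Examples of misrecognized text data on a boring log using the OCR engine.

| Original text | Examples of misrecognized text |
|---|---|
| 사업명 (project name) | 사업영, 사영영, 사명명, 면명, 사영명, 사험명, 사엉명, 사업펄, 사업형, etc. |
| 시추공번 (borehole number) | 시추공법, 시추공변, 시추공방, 시추공연, etc. |
| 조사일 (survey date) | 초사일, 조사할, etc. |
| 위치 (location) | - |
| 지반표고 (ground elevation) | - |
| 굴진심도 (drilling depth) | 굴진산도, 금전성도, 금전상도, 굴진실도, 금진삼도, 골진삼도, 궁진심도, etc. |
| 케이싱심도 (casing depth) | 케이싱싱도, 케이심심도, 케이성이도, 케이침실도, 케이십삼도, 케이십삼도, etc. |
| 시추방법 (drilling method) | 시추불편, etc. |
| 지하수위 (groundwater level) | 자동수위, etc. |
| 시추기 (drilling machine) | 시주기, 시주거, etc. |
| 시추공경 (borehole diameter) | 시추공정, 시추공검, etc. |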
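Many of the misrecognitions in Tables 6 and 7 differ from the original label by a single syllable or by inserted spaces. As a hedged illustration of one way such labels could be mapped back to the schema's field names, the sketch below uses approximate string matching from Python's standard `difflib` module. This is not a technique reported in the study, which found PDF text extraction to be the more accurate route; the function name `canonicalize` and the 0.6 cutoff are assumptions chosen for the example.

```python
import difflib
from typing import Optional

# Header labels as listed in Table 7.
CANONICAL_LABELS = [
    "사업명", "시추공번", "조사일", "위치", "지반표고", "굴진심도",
    "케이싱심도", "시추방법", "지하수위", "시추기", "시추공경",
]


def canonicalize(label: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the closest canonical label, or None if nothing is similar enough."""
    # OCR output often has spaces inserted between syllables; drop all whitespace.
    cleaned = "".join(label.split())
    matches = difflib.get_close_matches(cleaned, CANONICAL_LABELS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for raw in ["사업영", "사 업 영", "케이싱싱도", "굽진심도", "시주기", "뭐 지"]:
    print(repr(raw), "->", canonicalize(raw))
```

In this sketch the first five inputs resolve to 사업명, 사업명, 케이싱심도, 굴진심도, and 시추기, while "뭐 지" is too far from any label and returns `None`. Labels that differ from each other by only one syllable, such as 시추공번 and 시추방법, can tie under this similarity measure, so a matching step of this kind would need additional rules or manual review before being trusted.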


    Economic and Environmental Geology

    pISSN 1225-7281
    eISSN 2288-7962