Research Paper


Econ. Environ. Geol. 2021; 54(3): 353-364

Published online June 28, 2021

https://doi.org/10.9719/EEG.2021.54.3.353

© THE KOREAN SOCIETY OF ECONOMIC AND ENVIRONMENTAL GEOLOGY

Topic Model Analysis of Research Themes and Trends in the Journal of Economic and Environmental Geology

Taeyong Kim1, Hyemin Park1, Junyong Heo1, Minjune Yang2*

1Division of Earth Environmental System Sciences, Pukyong National University, Busan 48514, South Korea
2Department of Earth and Environmental Sciences, Pukyong National University, Busan 48514, South Korea

*Corresponding author: minjune@pknu.ac.kr

Received: April 17, 2021; Revised: June 14, 2021; Accepted: June 16, 2021

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided original work is properly cited.

Abstract

Since the mid-twentieth century, geology in South Korea has gradually evolved into an interdisciplinary field. The journal Economic and Environmental Geology (EEG) has a history of more than 52 years and publishes interdisciplinary articles grounded in geology. In this study, we performed a literature review using topic modeling based on Latent Dirichlet Allocation (LDA), an unsupervised machine learning model, to identify geological topics, historical trends (classic and emerging topics), and associations among topics by analyzing the titles, keywords, and abstracts of 2,571 articles published in EEG from 1968 to 2020. Eight topics were identified in EEG: ‘petrology and geochemistry’, ‘hydrology and hydrogeology’, ‘economic geology’, ‘volcanology’, ‘soil contaminant and remediation’, ‘general and structural geology’, ‘geophysics and geophysical exploration’, and ‘clay mineral’. Before 1994, the classic topics (‘economic geology’, ‘volcanology’, and ‘general and structural geology’) dominated research. After 1994, the emerging topics (‘hydrology and hydrogeology’, ‘soil contaminant and remediation’, and ‘clay mineral’) arose, and their share has gradually increased. Association analysis showed that research in EEG has become more interdisciplinary, building on ‘economic geology’. Our results show how geological research topics branch out and merge with other fields, and demonstrate topic modeling as a useful literature review tool for geological research in South Korea.

Keywords: Latent Dirichlet Allocation, topic modeling, trend analysis, association analysis, economic and environmental geology

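Classification of Research Topics and Analysis of Research Trends in the Journal “Economic and Environmental Geology” (“자원환경지질”) Using Machine Learning-Based Topic Modeling

Taeyong Kim1 · Hyemin Park1 · Junyong Heo1 · Minjune Yang2*

1Division of Earth Environmental System Sciences, Pukyong National University
2Department of Earth and Environmental Sciences, Pukyong National University

Abstract (Korean version)

Research fields in Korean geology have developed steadily since the mid-twentieth century. Economic and Environmental Geology is a long-established journal representative of Korean geology, and it publishes interdisciplinary research articles grounded in geology. This study conducts a literature review of the articles published in the journal in order to discuss the history and development of geology. We collected the titles, keywords, and multilingual abstracts of 2,571 articles published from 1968 to 2020 and applied topic modeling based on Latent Dirichlet Allocation (LDA) to classify research topics and to identify research trends and associations among topics. The articles in the journal could be classified into eight research topics (‘petrology and geochemistry’, ‘hydrology and hydrogeology’, ‘economic geology’, ‘volcanology’, ‘soil contaminant and remediation’, ‘general and structural geology’, ‘geophysics and geophysical exploration’, and ‘clay mineral’). Before 1994, ‘economic geology’, ‘volcanology’, and ‘general and structural geology’ were actively studied; afterwards, ‘hydrology and hydrogeology’, ‘soil contaminant and remediation’, ‘geophysics and geophysical exploration’, and ‘clay mineral’ became prevalent. The association analysis (network analysis) confirmed that the journal has published interdisciplinary research articles built on ‘economic geology’. The significance of this study lies in presenting researchers in geology with a new methodology for literature review and thereby providing an understanding of the history of geology.

Keywords: topic modeling, Latent Dirichlet Allocation, research trend analysis, association analysis, Economic and Environmental Geology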


    Figure 1. Total number of publications per year and cumulative number of publications from 1968 to 2020 in Economic and Environmental Geology (EEG). The turning-point year (1994), when the journal was renamed, is shown with a black dashed line.

    Figure 2. Flowchart of data processing for topic modeling of the Journal of Economic and Environmental Geology.

    Figure 3. Schematic of the Latent Dirichlet Allocation (LDA) graphical model. The observed node (the words of the documents) is shaded grey. Rectangles are plates, which denote replication.

    Figure 4. Significant classic and emerging topics sorted by the slope of the simple linear regression model (* represents p = 0.05, ** represents p = 0.01, *** represents p = 0.001, and **** represents p = 0.0001).

    Figure 5. Time series evolution of (a) classic topics and (b) emerging topics. The turning point, 1994 (dashed grey vertical line), divides the two time periods.

    Figure 6. Network structure of topics across two periods ((a) before 1994 and (b) after 1994). Nodes represent research topics and edges represent the degree of co-occurrence in abstracts and keyword sections.
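The computations behind Figures 4 to 6 can be illustrated with a minimal Python sketch. It assumes that a topic's yearly share is already available as a list of numbers, and that two topics "co-occur" in a document when both exceed a share threshold; the function names, the threshold of 0.2, and the toy data are hypothetical and are not taken from the paper. The sketch computes only the regression slope, not the p-values marked in Figure 4.

```python
from collections import Counter
from itertools import combinations


def ols_slope(years, shares):
    """Least-squares slope of a simple linear regression of shares on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(shares) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, shares))
    return sxy / sxx


def label_trend(slope):
    """Label a topic by the sign of its slope (assumed rule for illustration)."""
    if slope > 0:
        return "emerging"
    if slope < 0:
        return "classic"
    return "flat"


def cooccurrence_edges(doc_topic, threshold=0.2):
    """Count, for each topic pair, the documents in which both topics
    reach the share threshold (hypothetical definition of an edge weight)."""
    edges = Counter()
    for theta in doc_topic:
        present = [k for k, share in enumerate(theta) if share >= threshold]
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Toy data only: a topic whose share falls over time.
toy_years = [1990, 1995, 2000, 2005]
toy_shares = [0.30, 0.25, 0.20, 0.15]
print(label_trend(ols_slope(toy_years, toy_shares)))

# Toy per-document topic distributions for three documents and three topics.
toy_doc_topic = [[0.6, 0.3, 0.1], [0.5, 0.1, 0.4], [0.9, 0.05, 0.05]]
print(dict(cooccurrence_edges(toy_doc_topic)))
```

Sorting topics by the slope returned from `ols_slope` gives an ordering of the kind shown in Figure 4, and the pair counts from `cooccurrence_edges` can serve as edge weights in a network of the kind shown in Figure 6.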

    Table 1. Notation of variables and parameters used in this study.

    | Notation | Value | Description |
    |---|---|---|
    | Indices | | |
    | d | | Index of documents |
    | k | | Index of topics |
    | t | | Index of years |
    | In LDA | | |
    | α | 0.01 | Dirichlet prior on the per-document topic distribution (hyperparameter) |
    | η | 0.1 | Dirichlet prior on the per-topic word distribution (hyperparameter) |
    | θ_d | | Topic distribution of document d |
    | θ_{d,k} | | Distribution of topic k in document d |
    | β_k | | Word distribution of topic k |
    | z_{d,n} | | Topic assignment from θ_d in word n |
    | w_{d,n} | | Observed word n in document d |
    | K | 10 | Number of topics |
    | D | 2,571 | Number of documents |
    | V | 17,242 | Number of unique words in the LDA vocabulary |
    | N | 314,248 | Total number of words in all documents |
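With the notation of Table 1, the generative process that LDA assumes can be written out as below. This is the standard textbook formulation of LDA, not a transcription from the paper. The symbol N_d (the number of words in document d) is introduced here for illustration, since Table 1 defines only the corpus-wide total N.

```latex
\begin{align*}
\beta_k &\sim \mathrm{Dirichlet}(\eta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \beta_{1:K} &\sim \mathrm{Multinomial}(\beta_{z_{d,n}}), & N &= \sum_{d=1}^{D} N_d
\end{align*}

\begin{equation*}
p(\beta_{1:K}, \theta_{1:D}, z, w \mid \alpha, \eta)
  = \prod_{k=1}^{K} p(\beta_k \mid \eta)
    \prod_{d=1}^{D} \Bigl[ p(\theta_d \mid \alpha)
    \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\, p(w_{d,n} \mid \beta_{z_{d,n}}) \Bigr]
\end{equation*}
```

Only the words w_{d,n} are observed; fitting the model means inferring θ_d, β_k, and z_{d,n} from them.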

    Table 2. Topics and top-10 related words per topic.

    | Topic (ratio) | Topic label | Keywords (top-10) |
    |---|---|---|
    | Topic 1 (9.46%) | Petrology and geochemistry | granite, element, foliated, rock, ppm, uranium, asbestos, content, geochemical, area |
    | Topic 2 (12.08%) | Hydrology and hydrogeology | water, groundwater, sample, ph, reaction, metal, concentration, river, sample, stream |
    | Topic 3 (25.08%) | Economic geology | stage, fluid, deposit, vein, ore, mineral, inclusion, skarn, temperature, quartz |
    | Topic 4 (10.58%) | Volcanology | granite, rock, age, volcanic, tuff, formation, lava, island, alkali, basalt |
    | Topic 5 (10.35%) | Soil contaminant and remediation | soil, contaminant, heavy, metal, groundwater, water, weather, removal, cd, pb |
    | Topic 6 (11.72%) | General and structural geology | basin, formation, fault, okcheon, age, fold, structure, group, tectonic, direction |
    | Topic 7 (13.02%) | Geophysics and geophysical exploration | fault, landslide, gravity, magnet, seismic, model, velocity, map, slope, anomaly |
    | Topic 8 (7.72%) | Clay mineral | clay, layer, kaolin, kaolinite, medicine, mineral, deposit, rock, zone, heat |
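A table of this kind can be derived from a fitted model's outputs, as the following minimal Python sketch suggests. It assumes the word distribution β_k and the per-document topic distributions θ_d are available as plain lists. How the paper computed the "ratio" column is not stated here, so averaging θ_{d,k} over documents is an assumption, and the function names and toy values are hypothetical.

```python
def top_words(beta_k, vocabulary, n=10):
    """Return the n words with the highest probability under topic k."""
    ranked = sorted(range(len(beta_k)), key=lambda i: beta_k[i], reverse=True)
    return [vocabulary[i] for i in ranked[:n]]


def topic_ratios(doc_topic):
    """Average share of each topic over all documents, in percent.

    Assumption: the ratio is the mean of theta_{d,k} across documents.
    """
    n_docs = len(doc_topic)
    n_topics = len(doc_topic[0])
    return [
        100.0 * sum(theta[k] for theta in doc_topic) / n_docs
        for k in range(n_topics)
    ]


# Toy values only, not results from the paper.
toy_vocabulary = ["ore", "vein", "tuff", "quartz"]
toy_beta_k = [0.10, 0.50, 0.05, 0.35]
print(top_words(toy_beta_k, toy_vocabulary, n=2))

toy_doc_topic = [[0.6, 0.4], [0.2, 0.8]]
print(topic_ratios(toy_doc_topic))
```

Because each θ_d sums to one, ratios computed this way sum to 100% across topics, up to rounding.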

    Economic and Environmental Geology

    pISSN 1225-7281
    eISSN 2288-7962