LI Zequan, LIU Feixiang, ZHAO Jialiang, QI Hui, LI Jing. Construction of pre-training language model for coal mine safety hidden danger text[J]. Mining Safety & Environmental Protection. DOI: 10.19835/j.issn.1008-4495.20240109

Construction of pre-training language model for coal mine safety hidden danger text

  • At present, the large volume of unstructured text accumulated by coal mine safety management information platforms has not been fully utilized. To mine the knowledge contained in coal mine safety hidden danger text, a pre-trained language model (CoalBERT) is proposed, based on two learning mechanisms: domain-term word-mask language modeling (DP-MLM) and sentence order prediction (SOP). The model was trained on more than 1.1 million collected coal mine hidden danger investigation records and a self-constructed dictionary of 1,328 domain terms, and comparative experiments were conducted on two tasks: coal mine safety hidden danger text classification and named entity recognition. The results show that in the text classification experiment, the overall accuracy rate, recall rate and F1 value of CoalBERT are 0.34%, 0.21% and 0.27% higher, respectively, than those of the bidirectional encoder representations from transformers (BERT) pre-trained model. In the named entity recognition experiment, the accuracy rate and F1 value of CoalBERT are 3.84% and 2.13% higher, respectively, than those of BERT. CoalBERT can effectively improve semantic understanding of coal mine safety hidden danger text and can serve as a basic reference for text mining tasks in the field of coal mine safety.
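
The two pre-training objectives named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no procedural details, so the function names (find_term_spans, mask_domain_terms, make_sop_example), the greedy longest-match lookup, the mask_prob parameter, and the English example tokens are all assumptions made for illustration. The sketch also simplifies standard BERT masking by always substituting [MASK] for a selected term.

```python
import random

MASK = "[MASK]"  # assumed mask symbol, as in BERT-style vocabularies


def find_term_spans(tokens, terms):
    """Return non-overlapping (start, end) spans of dictionary terms.

    Uses a greedy longest-match scan; `terms` is a set of token tuples.
    """
    max_len = max((len(t) for t in terms), default=0)
    spans, i = [], 0
    while i < len(tokens):
        match_end = None
        # Try the longest candidate first so longer terms win over shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in terms:
                match_end = i + n
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans


def mask_domain_terms(tokens, terms, mask_prob=0.15, rng=random):
    """Mask each matched domain term as a whole unit with probability mask_prob.

    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere.
    """
    masked = list(tokens)
    labels = [None] * len(tokens)
    for start, end in find_term_spans(tokens, terms):
        if rng.random() < mask_prob:
            for j in range(start, end):
                labels[j] = tokens[j]
                masked[j] = MASK
    return masked, labels


def make_sop_example(seg_a, seg_b, rng=random):
    """Build a sentence order prediction example from two consecutive segments.

    Label 1 means the segments are in their original order, 0 means swapped.
    """
    if rng.random() < 0.5:
        return seg_a, seg_b, 1
    return seg_b, seg_a, 0
```

Masking whole dictionary terms rather than isolated tokens forces the model to predict a complete domain term from its context, which is the intuition behind term-level masking. The SOP pairs give it a signal about how consecutive sentences relate.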