Parallel LLM-driven Multimodal Classification for Bone Stick Artifacts

doi:10.15888/j.cnki.csa.009983


2025, Vol. 34, No. 11, pp. 139-150

CSTR: 32024.14.csa.009983
Funding: National Social Science Fund of China, Special Research Project on Neglected and Endangered Disciplines (20VJXT001)

Parallel LLM-driven Multimodal Classification for Bone Stick Artifacts


Abstract:

Of the roughly 60,000 bone stick fragments excavated from the Weiyang Palace site of the Han Dynasty, about 57,000 bear inscriptions. Most were found longitudinally fractured, with their upper and lower parts separated, which complicates both digital preservation and systematic classification. Manual classification is inefficient and risks further damage to the artifacts. To improve classification accuracy and support subsequent archaeological research, this study proposes a parallel multimodal classification model that combines bone stick images with their inscription text. The Vision-RWKV large model extracts visual features from the images, and the RWKV large model extracts textual features from the inscriptions. A dynamic cross-feature fusion module integrates the image and text features, and a classifier then performs fine-grained classification. Experiments show that the method reaches an accuracy of 92.85%, significantly outperforming conventional deep learning models and other multimodal large models. The results provide technical support for the efficient classification and organization of bone stick artifacts and lay a foundation for intelligent research in archaeology.
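The abstract names a two-branch pipeline (image features and inscription-text features extracted in parallel, fused, then classified) but does not specify how the fusion module works. The sketch below is only an illustration of that overall shape under stated assumptions: the class `GatedFusionClassifier`, its sigmoid gate, the feature dimension, and the random weights are all hypothetical, and the hand-written feature vectors stand in for the outputs of Vision-RWKV and RWKV, which are not implemented here. It should not be read as the paper's dynamic cross-feature fusion module.

```python
import math
import random


def linear(weights, bias, x):
    """Compute W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


class GatedFusionClassifier:
    """Hypothetical gated fusion head (an assumption, not the paper's module)."""

    def __init__(self, dim, num_classes, seed=0):
        rng = random.Random(seed)
        # The gate sees both modalities, so its input width is 2 * dim.
        self.gate_w = random_matrix(rng, dim, 2 * dim)
        self.gate_b = [0.0] * dim
        self.cls_w = random_matrix(rng, num_classes, dim)
        self.cls_b = [0.0] * num_classes

    def fuse(self, image_feat, text_feat):
        # Per-dimension gate in (0, 1), computed from the concatenated features.
        gate = [sigmoid(z)
                for z in linear(self.gate_w, self.gate_b, image_feat + text_feat)]
        # Convex combination: the gate decides how much each modality contributes.
        return [g * v + (1.0 - g) * t
                for g, v, t in zip(gate, image_feat, text_feat)]

    def predict_proba(self, image_feat, text_feat):
        fused = self.fuse(image_feat, text_feat)
        return softmax(linear(self.cls_w, self.cls_b, fused))


# Stand-in features; in the described method these would come from the
# image branch and the inscription-text branch respectively.
image_feat = [0.2, -0.5, 0.9, 0.1]
text_feat = [0.7, 0.3, -0.4, 0.0]

model = GatedFusionClassifier(dim=4, num_classes=3)
probs = model.predict_proba(image_feat, text_feat)
print(probs)
```

Because the gate depends on the input pair, the weighting between image and text changes from fragment to fragment, which is one plausible reading of "dynamic" fusion. The weights here are untrained, so the printed probabilities carry no meaning beyond showing the data flow.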


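Cite this article

Fan Tao, Wang Huiqin, Wang Ke, Liu Rui, Wang Zhan, Mao Li. Parallel LLM-driven Multimodal Classification for Bone Stick Artifacts. Computer Systems & Applications, 2025, 34(11): 139-150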


History

Received: 2025-03-20
Last revised: 2025-04-11
Published online: 2025-09-18
