Chinese Natural Language Processing Open Platform

ICTCLAS: the Chinese Lexical Analysis System of the Institute of Computing Technology
  
The word is the smallest meaningful unit of language that can be used independently. Chinese, however, is written character by character, with no explicit separators between words, so lexical analysis is the foundation of, and the key to, Chinese information processing. Building on years of research, the Institute of Computing Technology, Chinese Academy of Sciences, spent one year developing ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System), a Chinese lexical analysis system based on a multi-layer hidden Markov model (HMM). ICTCLAS performs word segmentation, part-of-speech (POS) tagging, and unknown word recognition. Its segmentation precision is 97.58% (from the most recent evaluation by the national 973 project expert panel). Unknown word recognition based on role tagging achieves a recall above 90%, and recall for Chinese person names is close to 98%. Word segmentation and POS tagging run at 543.5 KB/s.
ICTCLAS and 14 other freely released systems from the Institute of Computing Technology have been widely reported by media in China and abroad. As of September, ICTCLAS had been downloaded by more than 30,000 researchers and commercial organizations from China, Japan, Singapore, Korea, the USA, and other countries and regions. We are honored to distribute ICTCLAS free of charge and to help users solve Chinese lexical analysis problems.
ICTCLAS also comes with a complete dynamic link library (ICTCLAS.dll), a COM component, and the corresponding probabilistic dictionary. Developers can call ICTCLAS directly from their own systems without dealing with Chinese lexical analysis themselves. ICTCLAS can output multiple high-probability results on request, the output format can be customized, and developers can build higher-level applications on top of the segmentation and POS tagging results.
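As a rough illustration of what producing several high-probability segmentations can look like, the following Python sketch keeps the n best word sequences over a toy unigram dictionary. It is not ICTCLAS's implementation or API: the function name `n_best_segmentations`, the dictionary `TOY_WORD_PROB`, its probabilities, and the penalty `UNKNOWN_CHAR_COST` are all made up for this example.

```python
import heapq
import math

# Hypothetical toy dictionary: word -> unigram probability (illustrative values only).
TOY_WORD_PROB = {
    "研究": 0.02,
    "研究生": 0.005,
    "生命": 0.01,
    "命": 0.001,
    "的": 0.05,
    "起源": 0.005,
}

# Assumed penalty for a single character that is missing from the dictionary.
UNKNOWN_CHAR_COST = 20.0


def n_best_segmentations(text, word_prob, n=2):
    """Return up to n (cost, words) pairs, lowest cost first.

    Cost is the sum of -log(probability) over the words, so the
    lowest-cost path is the most probable segmentation under a unigram model.
    """
    max_len = max(len(w) for w in word_prob)
    # best[i] holds up to n cheapest segmentations of text[:i].
    best = [[] for _ in range(len(text) + 1)]
    best[0] = [(0.0, [])]
    for end in range(1, len(text) + 1):
        candidates = []
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_prob:
                cost = -math.log(word_prob[word])
            elif end - start == 1:
                cost = UNKNOWN_CHAR_COST  # fall back to one unknown character
            else:
                continue
            for prev_cost, prev_words in best[start]:
                candidates.append((prev_cost + cost, prev_words + [word]))
        best[end] = heapq.nsmallest(n, candidates, key=lambda c: c[0])
    return best[len(text)]


if __name__ == "__main__":
    for cost, words in n_best_segmentations("研究生命的起源", TOY_WORD_PROB, n=2):
        print(f"{cost:.2f}", "/".join(words))
```

With this toy dictionary the cheapest path is 研究/生命/的/起源 and the runner-up is 研究生/命/的/起源. Keeping several candidates at each position, rather than a single best one, is what lets a later stage such as POS tagging choose among alternative segmentations.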
Engineers and researchers in related fields are welcome to use ICTCLAS and ICTCLAS.dll in their own systems. Questions, comments, and advice are appreciated.
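To make the precision and recall figures quoted in the description more concrete, here is a minimal Python sketch of one common way to score a segmenter against a gold standard: treat every word as a character span and count the spans that match exactly. The page does not say how the 973 evaluation computed its scores, so this scoring scheme is an assumption for illustration, and the function names and example sentence are hypothetical.

```python
def word_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans, pos = set(), 0
    for word in words:
        spans.add((pos, pos + len(word)))
        pos += len(word)
    return spans


def segmentation_scores(gold_words, predicted_words):
    """Return (precision, recall, f1) for one segmented sentence.

    precision = correct predicted words / all predicted words
    recall    = correct predicted words / all gold words
    """
    gold = word_spans(gold_words)
    predicted = word_spans(predicted_words)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if correct == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["研究", "生命", "的", "起源"]
    predicted = ["研究生", "命", "的", "起源"]
    print(segmentation_scores(gold, predicted))
```

In this example only 的 and 起源 line up with the gold standard, so two of four predicted words are correct and two of four gold words are found. Recall for a single category, such as person names, can be computed the same way by restricting the gold spans to that category.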
Author: Kevin Zhang (张华平)
Institute: Institute of Computing Technology, Chinese Academy of Sciences
Email: zhanghp@software.ict.ac.cn
Tel: +86-10-88449181 ext. 718

 Created: 2002-08-16 11:26:42
 License: Natural Language Processing Open Resource License
 Platforms: Win9X, Win2000, Win NT, Win XP, Linux
 Programming language: C/C++

Project programs (4 items)
  • Chinese Lexical Analysis System ICTCLAS_1.0 (5098 downloads, 2004-06-09 11:26:24)

  • Chinese Lexical Analysis System ICTCLAS_1.0, Linux version (368 downloads, 2007-04-19 15:56:13)

  • ICTCLAS_revised_by_HuangJin (246 downloads, 2007-07-12 14:43:05)

  • ICTCLAS V1.2 (610 downloads, 2007-08-31 16:38:02)

Project description (6 items)
  • ICTCLAS self-evaluation results (8.1) (2280 downloads, 2002-09-14 19:55:19)

  • Evaluation report for ICTCLAS in the second-phase evaluation of the national 973 English-Chinese machine translation project (1582 downloads, 2002-09-14 19:54:26)

  • ICTCLAS.dll function call example (Delphi) (2385 downloads, 2002-09-12 09:45:14)

  • ICTCLAS.dll function call example (C) (6955 downloads, 2002-09-12 09:42:04)

  • ICTCLAS.dll function interface (9694 downloads, 2002-09-10 09:14:59)

  • ICTCLAS user guide (9859 downloads, 2002-09-10 09:11:33)

Papers (4 items)
  • Chinese Named Entity Recognition Using Role Model (5089 downloads, 2004-03-19 09:15:12)

  • Chinese Lexical Analysis Using HHMM-ACL2003\\HHMM-based Chinese Lexical Analyzer ICTCLAS (62 downloads, 2004-03-19 09:10:55)

  • Chinese Lexical Analysis Using HHMM-ACL2003\\Chinese Lexical Analysis Using Hierarchical Hidden Markov Model (3219 downloads, 2004-03-19 09:09:58)

  • Model of Chinese Words Rough Segmentation Based on N-Shortest-Paths Method (10465 downloads, 2002-09-06 13:34:51)

Documents (9 items)
  • ICTCLAS report (2109 downloads, 2007-07-12 14:44:24)

  • ICTCLAS study notes (250 downloads, 2007-07-12 14:43:49)

  • Chinese Lexical Analyzer ICTCLAS2.6 API Manual (3296 downloads, 2003-11-18 12:11:52)

  • ICTCLAS bug reports (2151 downloads, 2002-10-17 19:41:48)

  • System revision log (958 downloads, 2002-10-17 10:47:02)

  • ICTCLAS licensing policy (5495 downloads, 2002-09-14 20:18:49)

  • POS tag set for Chinese text (3504 downloads, 2002-09-10 09:12:04)

  • Coling2002 summary (2743 downloads, 2002-09-09 18:19:19)

  • Integrated Chinese word analysis system (10364 downloads, 2002-09-06 13:40:08)

Demo system (1 item)
  • ICTCLAS3.0 API (5635 downloads, 2007-04-09 16:07:33)

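    Project administrator
      Pipy

    Project members
      patch
    Project news
    Update of 2007-7-12
    2007-07-12 14:41:22

    Update of 2007-7-12
    2007-07-12 14:41:17

    Update of 2007-7-12
    2007-07-12 14:39:39

    You are welcome to test the ICTCLAS Chinese lexical analysis system online
    2002-09-24 21:24:21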