Xidian University
Modeling Term Associations for Information Retrieval
Posted: 2017-08-16 16:07    Organizing unit: School of Electronic Engineering
Lecture title: Modeling Term Associations for Information Retrieval
Lecture time: 2017-08-21 09:30:00
Venue: Room 430, Zone III, Main Building, North Campus
Speaker: Jimmy Xiangji Huang
About the speaker: Jimmy Huang received his Ph.D. in Information Science from City University in London, England, and was then a Post Doctoral Fellow at the School of Computer Science, University of Waterloo, Canada. He also worked in the financial industry in Canada, where he was awarded a CIO Achievement Award. He joined York University in July 2003 as an assistant professor. He is now the Director of, and a Professor at, the School of Information Technology, and the Director of the Information Retrieval & Knowledge Management Research Lab (IRLab) at York University, where he is also cross-appointed as a graduate faculty member in the programs of Information Systems and Technologies, Computer Science and Engineering, Mathematics and Statistics, Health Informatics, and Emergency Management. Jimmy Huang received the Dean's Award for Outstanding Research in 2006, an Early Researcher Award (formerly the Premier's Research Excellence Award) in 2007, the Petro Canada Young Innovators Award in 2008, the SHARCNET Research Fellowship Award in 2009, the Best Paper Award at the 32nd European Conference on Information Retrieval (ECIR 2010) in the UK, the 2010 Web Intelligence Consortium Outstanding Service Award, and the LA&PS Award for Distinction in Research, Creativity and Scholarship (Established Researcher) in 2015. Since 2003, he has published more than 170 refereed papers in journals (such as ACM TOIS, JASIST, IPM, IEEE TKDE, Information Sciences, IR, BMC Bioinformatics and BMC Genomics), book chapters and international conference proceedings (such as ACM SIGIR, ACM CIKM, COLING and IEEE ICDM). His Master's (M.Eng.) and Bachelor's (B.Eng.) degrees were in Computer Organization & Architecture and Computer Engineering, respectively. At York University, he was awarded tenure in 2006 and promoted to Full Professor in 2011.
He was the General Conference Chair for the 19th International ACM CIKM Conference and the General Program Chair for the IEEE/WIC/ACM International Joint Conferences on Web Intelligence & Intelligent Agent Technology in 2010. His research interests include information retrieval, big data analytics with complex structures, medical/health informatics, text/Web mining, natural language processing, bioinformatics and computational linguistics.
Abstract: Traditionally, many probabilistic retrieval models assume that query terms are independent. Although such models can achieve reasonably good performance, associations among terms clearly exist from a human point of view. Some recent studies investigate how to model term associations/dependencies using proximity measures. However, a theoretical treatment of term associations within the probabilistic retrieval framework remains largely unexplored. In this talk, I will introduce a new concept, the Cross Term, which models term proximity with the aim of boosting retrieval performance. With Cross Terms, the association of multiple query terms can be modeled in the same way as a simple unigram term. In particular, an occurrence of a query term is assumed to have an impact on its neighboring text, and the degree of this impact gradually weakens with increasing distance from the place of occurrence. We use shape functions to characterize such impacts. Based on this assumption, we first propose a bigram CRoss TErm Retrieval (CRTER2) model as the basis model, and then recursively propose a generalized n-gram CRoss TErm Retrieval (CRTERn) model for n query terms, where n > 2. Specifically, a bigram Cross Term occurs when the corresponding query terms appear close to each other, and its impact can be modeled by the intersection of the respective shape functions of the query terms. For n-gram Cross Terms, we develop several distance metrics with different properties and employ them in the proposed models for ranking. We also show how to extend the language model using the newly proposed Cross Terms. Extensive experiments on a number of TREC collections demonstrate the effectiveness of our proposed models.
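The bigram Cross Term idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the shape function or how pair impacts are aggregated, so everything concrete here is an assumption made for illustration: the Gaussian shape, the `sigma` value, the function names, and the choice to sum impacts over all occurrence pairs. The one property taken from the abstract is that the impact of a bigram Cross Term is read off where the two shape functions intersect; for two identical, symmetric shape functions centered on the two term positions, that intersection lies at the midpoint, that is, at half the distance between them.

```python
import math
from itertools import product


def gaussian_shape(distance: float, sigma: float = 25.0) -> float:
    """Impact of a term occurrence at a given distance from it.

    Assumption: a Gaussian shape; the abstract only says that the
    impact weakens gradually as distance grows.
    """
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def bigram_cross_term_impact(pos_a: int, pos_b: int, sigma: float = 25.0) -> float:
    """Height at which two identical shape functions, centered at
    pos_a and pos_b, intersect (the midpoint between the positions)."""
    return gaussian_shape(abs(pos_a - pos_b) / 2.0, sigma)


def cross_term_frequency(tokens, term_a, term_b, sigma: float = 25.0) -> float:
    """Hypothetical pseudo-frequency of the bigram Cross Term in a document:
    the sum of intersection impacts over all pairs of occurrences.
    (The summation rule is an assumption, not taken from the abstract.)
    """
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return sum(
        bigram_cross_term_impact(a, b, sigma)
        for a, b in product(positions_a, positions_b)
    )


if __name__ == "__main__":
    near = "modern information retrieval systems rank documents".split()
    far = ("information " + "filler " * 60 + "retrieval").split()
    # Nearby query terms yield a larger Cross Term frequency than distant ones.
    print(cross_term_frequency(near, "information", "retrieval"))
    print(cross_term_frequency(far, "information", "retrieval"))
```

Under these assumptions, the resulting value could be plugged into a unigram weighting scheme in place of an ordinary term frequency, which is one way to read the abstract's remark that Cross Terms can be modeled in the same way as a simple unigram term.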