Speaker: Mark Sanderson

Mark Sanderson is Professor of Information Retrieval at RMIT University, where he is head of the RMIT Information Retrieval (IR) group. Mark received his Ph.D. in Computer Science from the University of Glasgow, United Kingdom, in 1997. He has raised over $10 million in grant income, published hundreds of papers, and has over 8,500 citations to his work. He has 25 current and/or past PhD students. In collaboration with one student, Mark was the first to show the value of snippets, a component of search engine interfaces that is now a standard feature of all engines. One of Mark's papers was given an honourable mention at SIGIR's 2017 test of time awards. Mark has been co-editor of Foundations and Trends in Information Retrieval; associate editor of IEEE TKDE, ACM TOIS, ACM TWeb, and IP&M; and served on the editorial boards of IRJ and JASIST. Mark was general chair of ACM SIGIR in 2004. He was a PC chair of ACM SIGIR 2009 and 2012, and of ACM CIKM 2017. Prof Sanderson is also a visiting professor at NII in Tokyo.

Title: Exploring a Cyber/Physical/Social Model of Context
Abstract: The primary challenge for a search engine is to cope with the underspecified queries or questions that users provide as input. Most of the research innovations in information retrieval have been attempts to cope with such underspecified information needs. One approach to determining what is needed is to examine the context of the user when they search. At RMIT, we have been exploring a context that sits at the junction of the physical and the online. Our so-called cyber/physical/social context examines simultaneously what a user is doing online, where they are, and who they are with. I will describe the datasets, analysis, and modelling that we have undertaken to understand how this context can be exploited in a range of different scenarios. I will also describe some of the early work that we are conducting in a project with Microsoft Research to extend analysis of this context further.


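Xindong Wu (吴信东) is Chief Scientist and Vice President of Mininglamp Technology (明略数据) and a professor. He is a selectee of China's national Thousand Talents Program, a recipient of the Overseas Outstanding Young Scholar award, a Changjiang Scholar, an IEEE Fellow, and an AAAS Fellow. In 2012 he received the IEEE Computer Society Technical Achievement Award "for pioneering contributions to data mining and applications". He is Editor-in-Chief of KAIS (Knowledge and Information Systems) and co-Editor-in-Chief of TKDD (ACM Transactions on Knowledge Discovery from Data), and he is the founder and Steering Committee Chair of ICDM (IEEE International Conference on Data Mining). From January 2005 to December 2008 he served two terms as Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering (TKDE). In 2014 he received the IEEE ICDM 10-Year Highest-Impact Paper Award, and a PhD student he supervised received the China Computer Federation (CCF) Outstanding Doctoral Dissertation Award in the same year.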

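Abstract: The Mingcha (明察) retrieval system is a platform for fusing massive data resources through multi-channel acquisition, network-wide aggregation, and multi-dimensional integration. Oriented toward intelligent application services, it gathers government department data, social and industry data, internet data, and Internet of Things data for which there is a sharing need, thereby pooling public security data resources. It fuses multi-source data using big data technology and, based on a public security knowledge graph, builds generalized connections across petabyte-scale data. The system supports dynamic indexing, field association, ID fusion, multi-dimensional retrieval, spatio-temporal comparison, and visualization. It provides query and analysis over basic policing information, and intelligent policing support for operational needs such as surveillance and early warning, spatio-temporal awareness, and comparison and collision analysis.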

Speaker: Grace Hui Yang

Dr. Grace Hui Yang is an Associate Professor in the Department of Computer Science at Georgetown University. Dr. Yang leads the InfoSense (Information Retrieval and Sense-Making) group at Georgetown University, Washington D.C., U.S.A. Dr. Yang obtained her Ph.D. from the Language Technologies Institute, Carnegie Mellon University, in 2011. Dr. Yang's current research interests include deep reinforcement learning, dynamic information retrieval, search engine evaluation, privacy-preserving information retrieval, internet of things, and information organization. Prior to this, she conducted research on question answering, ontology construction, near-duplicate detection, multimedia information retrieval, and opinion and sentiment detection. Dr. Yang's research has been supported by the Defense Advanced Research Projects Agency and the National Science Foundation. Dr. Yang is a recipient of the prestigious National Science Foundation (NSF) Faculty Early Career Development Program (CAREER) Award. Dr. Yang has co-chaired SIGIR 2013 and 2014 Doctoral Consortiums, SIGIR 2017 Workshop, WSDM 2017 Workshop, ICTIR 2017 Workshop, CIKM 2015 Tutorial, ICTIR 2018 Short Paper, and SIGIR 2018 Demonstration Paper Program Committees. Dr. Yang served on the editorial board of Information Retrieval Journal from 2014 to 2017. She has served as an area chair/senior program committee member for SIGIR 2014-present, WSDM 2018-present, ECIR 2017, and ACL 2016. Dr. Yang also co-organized the Text Retrieval Conference (TREC) Dynamic Domain Track from 2015 to 2017 and led the effort for SIGIR privacy-preserving information retrieval workshops from 2014 to 2016.

Title: Dynamic Search and Beyond
Abstract: In modern Information Retrieval (IR), users, data, and systems are often highly interactive and exhibit dynamic characteristics that are ignored by conventional approaches. What is missing is an ability for the retrieval models to change over time and be responsive to stimuli in the environment. This talk presents our up-to-date research on statistical modeling of dynamic search. The talk introduces a range of retrieval models and evaluation techniques that dynamically adjust themselves based on the signals collected over long time spans from dynamic behaviors in documents, users, tasks, and relevance judgements. The talk highlights how we model information seeking using a variety of reinforcement learning methods and achieve high accuracy in the TREC Session and TREC Dynamic Domain Tracks. The talk also gives perspectives on future directions in dynamic IR.

Speaker: Hang Li

Hang Li is a director of AI Lab, Bytedance Technology (also known as Toutiao), and an adjunct professor at Peking University and Nanjing University. He is an IEEE Fellow and an ACM Distinguished Scientist. His research areas include natural language processing, information retrieval, machine learning, and data mining. Hang graduated from Kyoto University in 1988 and earned his PhD from the University of Tokyo in 1998. He worked at NEC Research as a researcher from 1990 to 2001, at Microsoft Research Asia as a senior researcher and research manager from 2001 to 2012, and at Huawei Noah's Ark Lab as chief scientist and director from 2012 to 2017. He joined Bytedance in 2017. Hang has published three technical books and more than 120 technical papers at top international conferences including SIGIR, WWW, WSDM, ACL, EMNLP, ICML, NIPS, SIGKDD, AAAI, and IJCAI, and in top international journals including CL, NLE, JMLR, TOIS, IRJ, IPM, TKDE, TWEB, and TIST. Papers by Hang and his colleagues received the SIGKDD'08 best application paper award, the SIGIR'08 best student paper award, and the ACL'12 best student paper award. Hang worked on the development of several products such as Microsoft SQL Server 2005, Office 2007, Live Search 2008, Bing 2009, Office 2010, Bing 2010, Office 2012, Huawei smartphones 2014, and Huawei smartphones 2017. He has 42 granted US patents. Hang is also very active in the research communities and has served or is serving top international conferences as PC chair, senior PC member, or PC member, including SIGIR, WWW, WSDM, ACL, NAACL, EMNLP, NIPS, SIGKDD, ICDM, IJCAI, and ACML, and top international journals as associate editor or editorial board member, including CL, IRJ, TIST, JASIST, and JCST.

Title: Toward Building Self-Training Search Systems
Abstract: It would be highly desirable if a search system could automatically and continuously improve its performance by training itself while serving users. In a search system, users submit queries and click on documents that are relevant. The system records user behavior data, particularly click data. Click data represents users' implicit feedback to the system, which is intrinsically helpful for improving search relevance. On the other hand, the data is also noisy and biased. If the search system is sufficiently intelligent, it should be able to automatically mine valuable information from the data and identify ways to further improve its relevance. Recently, a new direction in IR called unbiased learning to rank (ULTR) has been emerging and making progress. ULTR aims to automatically debias click data and use the debiased click data to train a ranker for search. ULTR opens up new opportunities for building search systems that can perform self-training as described above. In this talk, I will give an introduction to ULTR, including our recent work on it. I will also discuss the challenges and opportunities with regard to this important technology.



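Associate Professor of Computer Science at Renmin University of China. Research areas are social data mining and natural language processing, with more than 60 published papers. These papers have attracted a fair amount of attention: according to Google Scholar, they have been cited more than 2,000 times in total, including more than 900 citations for the first-authored paper "Comparing Twitter and Traditional Media Using Topic Models". Honors include a CIKM 2017 best short paper nomination, the AIRS 2017 best paper award, and selection for the second CCF Young Talent Development Program. Service includes reviewing for several top international journals and conferences, and acting as publication chair of AIRS 2016 and area chair of SMP 2017 and NLPCC 2017.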

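Abstract: In the era of big data, users' personalized needs keep growing, and the demand for more intelligent information systems poses many challenges. Faced with massive amounts of data, helping users effectively obtain the information they need, and thereby substantially alleviating information overload, is one of the main research challenges for data researchers. Broadly speaking, information processing systems currently operate in two modes. The first is the "pull" mode, typified by search engines: the user submits a query and the system returns search results. The second is the "push" mode, typified by recommender systems: the user is not required to explicitly submit any query or interest preference, and the system pushes information through automated algorithms. In the era of intelligent information, recommender systems are especially important and have become one of the core technology modules of internet and data service companies. This talk will review the main technical advances in recommender systems over the past twenty years, with particular emphasis on several applications of deep learning algorithms in the field, and will conclude with an outlook on the main technical challenges and future research directions for recommender systems.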



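Abstract: As an online sales platform, e-commerce offers consumers online shopping channels along many dimensions. By connecting users with products, information retrieval plays an important role in e-commerce. Information discovery on e-commerce platforms focuses mainly on the application and study of different types of retrieval, recommendation, and natural language processing in e-commerce. In recent years, a series of research efforts on e-commerce information discovery, including research papers, tutorials, and discussion groups, have been put forward and have played an increasingly large role in practice. Overall, this forward-looking research focuses on improving the performance of relevance search in e-commerce, accurately tracking user behavior, improving recommender system performance, using natural language processing techniques to optimize the user experience, and developing innovative question answering and dialogue systems that better connect users with products. This talk will focus on the search and natural language processing work in e-commerce information discovery.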