第十六届中国R会议(北京)

张志华

现代人工智能的本质、途径和未来方向

张志华,北京大学数学科学学院教授,北京大学计算机学院兼职教授和博士生导师,主要从事统计学、机器学习与理论计算机科学领域的研究和教学。曾多次担任NeurIPS、ICML、ICLR等国际重要人工智能和机器学习会议领域主席,是国际机器学习旗舰刊物Journal of Machine Learning Research的执行编委以及CSIAM Transactions on Applied Mathematics编委,中国现场统计研究会机器学习分会理事长。已在JMLR、AI、AOS、MP等期刊以及COLT、NeurIPS、ICML、ICLR、IJCAI、AAAI、AISTATS、UAI、MLSys、KDD、CVPR、ACL、EMNLP等会议发表论文100多篇,并著有《深度强化学习》,组织翻译了《深度学习》和《人工智能:现代方法》等经典教材,开设有《机器学习导论》、《统计机器学习》、《深度学习》和《强化学习》等网上公开课。

现代人工智能是通过机器学习,以及由其驱动而发展起来的计算机视觉、自然语言处理和语音识别等技术,来实现多模态数据融合的现实交互。人工智能主要处理识别、决策和生成三大任务,这与机器学习的三大学习范式(有监督学习、强化学习和无监督学习)相一致。数学上,人工智能试图求解具有组合结构的高维复杂问题。该报告将讨论人工智能的技术路线、一些未来的研究方向以及我国的发展现状。

肖诗汉

昇思MindSpore技术创新进展和超大规模AI实践

肖诗汉,清华大学计算机系博士,昇思MindSpore大模型训练技术专家。在AI技术应用于网络基础设施方向有多项研究成果与创新落地,并在NSDI、JSAC、TON等相关领域国际顶级会议及期刊发表论文三十余篇。主要从事大规模AI训练与推理系统方向,致力于实现大规模AI训练与推理系统优化的自动化与极致性能。

以大模型、大数据、大算力驱动的超大规模AI正成为迈向通用强人工智能最有潜力的技术方向。为此,昇思MindSpore构建了多维度混合并行、多维度内存优化、图算融合等关键技术,并从2021年开始孵化了20多个大模型,其中6个为千亿参数规模以上的大模型;联合知名高校及科研院所协同创新,发布了鹏城实验室NLP大模型、中科院紫东太初图文音多模态大模型、鹏程神农蛋白质结构大模型、武大珞珈遥感大模型等。本次报告将从MindSpore大模型实践出发,介绍MindSpore超大规模AI关键技术,以及如何使用这些关键技术训练大模型,助力产业发展。

董迪

影像组学及其在临床中的应用

董迪,中国科学院自动化研究所研究员,博导,基金委优青;北京癌症防治学会胃癌防治专业委员会常务委员,全国医学影像领域学者论文学术影响力排名(2012~2021) Top 100学者,2022年全球前2%高引用科学家,中国图象图形学会“青年科学家”获得者,全国胃癌学术大会“未来科学家”,国家科技部重点研发计划青年科学家项目负责人。长期从事肿瘤人工智能分析的研究工作,近年来在医学领域主流SCI期刊Annals of Oncology (SCI IF: 50.5,2篇),European Respiratory Journal (SCI IF: 24.3)等上发表论文100余篇,ESI Top 1%高被引论文12篇,谷歌H因子53,研究被纳入《中国临床肿瘤学会CSCO胃癌诊疗指南》。

近年来人工智能方法的迅速发展和医学影像数据的急剧增长催生了医工交叉领域的影像组学技术。影像组学可从计算机断层扫描成像(CT)、正电子发射断层扫描成像(PET)、磁共振成像(MRI)、超声成像(US)等影像大数据中挖掘出反映疾病分子细胞水平改变的高维量化信息,并融合临床信息进行疾病的辅助诊断、疗效评估和预后预测,在临床诊疗领域显示出广阔的应用前景。
影像组学在肿瘤方面的应用较多,比如辅助胃癌隐匿性腹膜转移判断、肺癌免疫治疗疗效评估、鼻咽癌预后预测等。此外,影像组学还被应用于肝纤维化分期诊断、孕早期胎儿唐氏筛查等非肿瘤的诊断中。这些典型应用显示影像组学可以利用现有影像数据,辅助提高诊疗效果,从而使患者获益。影像组学的临床应用呈现百花齐放的同时,影像组学新方法的探索也在不断推进,其中以影像病理融合、裸数据智能分析等为代表的新方法不断涌现。
本报告主要围绕影像组学新技术及其在临床中的应用进行介绍,从影像组学背景、影像组学应用、影像组学关键技术三个方面展开。在典型应用方面,将介绍影像组学在疾病诊断、疗效评估、以及预后预测等方面的临床应用;在关键技术方面,将详细介绍近两年新出现的影像组学新方法。

俞声

大模型在知识图谱建设中的应用

俞声,清华大学统计学研究中心长聘副教授,博士生导师,研究生教学主管,国家青年拔尖人才获得者。俞声长期从事医学自然语言处理、人工智能与电子病历分析技术研究。俞声开发的电子病历自然语言处理系统被10个国家和地区的医学研究机构使用。俞声发明的高通量表型提取技术使i2b2疾病表型识别算法开发速度从每年1-2个疾病提高到每年超过1000个疾病,并应用于“Million Veteran Program”等美国国家级精准医学研究项目以及多家医院的生物样本库建设;该系列论文获评医学信息学顶刊JAMIA的编辑选择奖、国际医学信息学学会2019年年鉴最佳论文奖,并按标准化生物医学实验方法发表于Nature Protocols。归国后,俞声获得多项国家基金和人才项目支持,带领团队围绕中文电子病历和智能诊疗发展了高通量知识图谱构建、电子病历分析、生物医学机器翻译、临床诊断决策支持等一系列技术,并与IDEA研究院合作,指导开发了大规模开放生物医学知识图谱BIOS,为医疗行业大数据处理与人工智能开发建立公共基础。

大型语言模型的出现带来了自然语言处理乃至更广泛的人工智能能力的大幅跃升。在全自动生物医学知识图谱建设中,同义术语识别和关系提取作为建造图谱的点和边的关键技术,一直以来都存在着难以突破的天花板,其症结在于这些问题的判断必须结合所涉及对象的复杂的背景知识。大模型以参数形式隐式承载的庞大的知识量,为突破这些难题提供了关键助力。本报告结合世界最大的单体生物医学知识图谱BIOS的建造,分别介绍通过对术语相关知识进行对比学习从而实现3500万条术语按同义词聚类的工作,和利用大模型做深层次阅读理解实现高准确率医学关系提取的工作。

周静

一种集成3D CNN模型用于肺腺癌的病理亚型识别

周静,中国人民大学统计学院副教授、博士研究生导师,应用统计科学研究中心研究员,中国现场统计研究会统计交叉科学研究分会常务理事,北京大学光华管理学院博士,研究方向为网络数据建模、人工智能在肺癌诊疗中的应用、医学图像分析等,在npj Digital Medicine、BIBM、Journal of Business & Economic Statistics、Statistica Sinica、Computational Statistics and Data Analysis、Neurocomputing、Science China Mathematics等Nature子刊及统计学权威期刊上发表论文二十余篇,授权发明专利3项,著有专著《社交网络数据:理论与实践》,主编《深度学习:基于PyTorch的实现》、《深度学习:从入门到精通》两本深度学习教材,主持国家自然科学基金(青年、面上)、北京市社会科学基金、国家统计局重大、重点等多项国家级、省部级以上课题。

肺癌一直是全球威胁人类健康最常见的癌症之一,它也是导致癌症相关死亡的主要原因,约占全部癌症相关死亡的18%。本研究从实际的临床问题出发,系统解决了临床实践中肺结节诊断的三个重点问题,分别是:
1、结节是良性的还是恶性的?该问题的解决能回答患者要不要做手术的问题。
2、如果结节是恶性的,这是一个浸润前的病变还是浸润性的病变?该问题的解决能回答患者什么时候做手术的问题。
3、如果是浸润性的病变,那么它的风险等级是什么(例如是高分化、中分化还是低分化)?该问题的解决能回答患者如果手术可以采取何种手术方式的问题。
我们将以上三个问题称为肺结节诊断的3W问题(Whether、When、Which)。为了解决以上问题,本研究提出了一个三阶段的EMV-3D-CNN模型。我们的模型在诊断良/恶性结节和浸润前病变/浸润性病变方面分别获得了91.3%和92.9%的AUC。尤其值得注意的是,我们的模型在浸润性腺癌的风险分级预测(即对浸润性腺癌进行高、中、低分化三个等级的分类)方面优于影像科医生的判断,准确率达到了77.6%。最后,为了方便医生和患者访问,我们将提出的模型实现为基于Web的系统(https://seeyourlung.com.cn)。

张晨

基于人工智能的M蛋白血症辅助诊断系统

张晨,清华大学工业工程系副教授。主要研究方向为基于统计学与人工智能的工业数据分析,包括复杂数据,如函数型数据、高维变量数据、网络数据、时间序列数据等的建模、因果推断与在线监控算法,研究成果发表在IISE Transactions、Journal of Quality Technology、IEEE Transactions、SIGKDD、AAAI、IJCAI等期刊与会议上。研究成果获得美国质量协会ASQ、国际工业系统工程协会IISE、电气与电子工程师协会IEEE、运筹学和管理学研究协会INFORMS最佳论文奖十余项,获得教育部科学技术进步二等奖。主持完成国家自然科学基金项目2项、省部级和企业课题10余项,入选中国科协青年人才托举工程。

M蛋白是诊断浆细胞疾病的重要指标,其检测主要通过免疫固定电泳(IFE)技术来实现。IFE图像中致密带的特征决定了M蛋白的分型结果,传统的诊断依赖于医学专家的专业判读,具有较高的人力成本和主观性误差。为了解决这些问题,本项目开发了基于人工智能的M蛋白计算机辅助诊断系统,基于IFE图像和临床指标数值两种异构数据,提出了多模态自监督学习算法。该算法首先利用掩码自编码器对无标签图像进行预训练,并使用有标签图像进行参数微调;然后通过特征工程对指标数值进行预处理并构造相关系数矩阵,再采用卷积神经网络提取特征;最后对两种模态的特征进行融合,挖掘不同模态特征之间的互补性。算法在真实大规模数据集上得到验证,取得了优异的诊断效果。

朱超杰

carbonGPT:大模型在碳金融中的应用

朱超杰,毕业于北京航空航天大学,现任光明实验室人工智能算法研究员,负责团队大语言模型的应用研究,曾任职于微众银行AI项目部。

本报告将介绍如何借助当前的大语言模型技术,解析和分析碳汇领域的文档及政策;同时介绍如何利用RAG(Retrieval-Augmented Generation)技术优化大模型对实时数据的处理,以提供准确、及时的碳市场信息,为碳金融市场的发展提供技术支持。
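下面用纯 Python 给出一个极简的 RAG 流程示意(仅为原理性示意,并非报告所述系统;检索用词频向量的余弦相似度代替真实的向量数据库,语料与函数名均为演示性假设):

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """词频向量的余弦相似度。"""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list, k: int = 2) -> list:
    """检索与问题最相关的 k 篇文档(RAG 中的 R 步骤)。"""
    q = Counter(query.split())
    scored = sorted(docs, key=lambda d: cosine(q, Counter(d.split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list) -> str:
    """将检索结果拼入提示词,再交给大模型生成回答(G 步骤)。"""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"参考资料:\n{context}\n问题:{query}\n请依据参考资料回答。"

corpus = [
    "碳配额 价格 今日 上涨",
    "碳汇 项目 审批 流程 说明",
    "股票 市场 行情 回顾",
]
prompt = build_prompt("碳汇 审批 流程", corpus)
print(prompt)
```

真实系统中,检索一步通常由向量数据库完成,生成一步则由大模型接收拼好的提示词后产出回答。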

郭晓波

智能模型在银行数字化营销场景中的若干应用

郭晓波博士,民生银行数据管理部高级经理,长年深耕支付、信贷、理财及综合金融等业务,基于客户营销与服务、风险控制、运营增效、产品迭代与体验优化等场景和领域,构建面向企业级数据画像、搜推广营与智能风控等方向的数字化营销及数字化风控服务体系;担任北京金融科技产业联盟高级专家以及多所高校企业导师,拥有40多项专利,30余项研究成果发表在JMLR/KDD/TKDE/TOIS/TKDD/AIRE/CVPR/ICDE/WSDM等国际顶级期刊与会议上。

本演讲结合行业内外发展现状与趋势,重点介绍智能模型在银行数字化营销场景中的若干应用。
- 行业发展趋势判断与讨论;
- 智能模型在商业银行数字化营销场景中的若干应用;
- 结合当下金融行业数字化转型背景,浅谈数字化营销实践中遇见的各类问题。

金逸飞

人工智能在量化投资中的应用

金逸飞,北京大学物理学院本科,清华大学交叉信息研究院博士,清华大学计算机系博士后,现任交叉科技(北京)有限公司CEO,南京恒生交叉智能信息技术有限公司CTO。研究方向为算法设计、机器学习和量化投资。解决了计算几何领域两个开放了近二十年的未解难题:带权单位圆集合覆盖及Yao-Yao图跨度性问题。首次将机器学习经典算法mu-SVM的时间复杂度提高到近似线性。主持参与了清华大学-易方达人工智能投资组合构建、中国平安车险反欺诈、国信证券智能债券违约预警、恒生电子量化算法等多个人工智能量化项目的研发和落地。

机器学习与人工智能正在给金融领域带来革命性的改变,被广泛应用于智能客服、投资研究、算法交易、投资顾问、合规/风控等多个领域。人工智能在量化投资中的应用涉及因子研究、择时与行业轮动、风险控制和组合构建、微观结构、交易执行优化等多个方面。我们将结合自身的项目实践,探讨人工智能在量化投资多个典型场景的建模方案,包括时空大数据深度学习框架、量化因子挖掘算法、样本与特征选择算法以及另类数据处理等多个方面的实践成果。

童毅炜

ModelOps 在数据科学平台的实践与应用

童毅炜,和鲸科技产品副总监,关注ModelOps在开放科研、AI for Science等场景下的应用与实践。

数据科学研究的复现往往依赖于数据、环境和代码,大模型兴起之后,部分研究对于算力的依赖也更为紧密。基于 ModelOps 的视角,我们可以利用云计算相关技术对算力、环境、数据等科研资料进行更有效的管理与整合。与此同时,算法模型作为相对较新的研究产出,其作为成果的生命周期管理和内部自制科研工具的流转,也应该纳入科研模型开发应用平台的考量。

莫欣

智能体(Agent)构建和应用开发探讨

莫欣,13年大数据平台、大数据应用产品、商业分析、研发工具等领域相关经验,产研运全栈,前光年之外开发者生态产品专家、前美团资深数据产品专家,参与过多项大型互联网公司工程平台型产品的产品设计、架构设计、产品运营工作。

1. 探讨应用层面从大模型到智能体的演变过程
2. 介绍智能体的核心构成
3. 介绍基于Agently框架的智能体能力应用案例
4. 介绍基于Agently框架开发智能体的思想和方法

覃文锋

Polars - Modern DataFrame

覃文锋,国内头部量化基金工程师,从事量化交易系统框架开发的有关工作。

Polars 是一个类似 Pandas 的 Python DataFrame 库。Pandas 一般只能使用单个 CPU 核心来执行操作,而 Polars 可以利用机器上的多个 CPU 核心,自动并行执行用户定义的数据运算流。其内置的查询优化器可以减少不必要的内存分配,避免重复运算。Polars 还支持以流式计算的方式,处理超过系统内存大小的超大数据集。

张先轶

从OpenBLAS到PerfXAPI异构计算框架

张先轶,博士,本科和硕士毕业于北京理工大学,博士毕业于中国科学院大学,曾于中科院软件所工作,之后分别在UT Austin和MIT进行博士后研究工作。国际知名开源矩阵计算项目OpenBLAS发起人和主要维护者。中国计算机学会高性能计算专业委员会委员,ACM SIGHPC China执行委员。2016年,创办PerfXLab澎峰科技,提供异构计算软件栈与解决方案。2016年获得中国计算机学会科学技术二等奖,2017年获得中国科学院杰出科技成就奖,2020年获得美国SIAM Activity Group on Supercomputing最佳论文奖。

随着AI和工业软件等领域的蓬勃发展,对计算的需求不断上升。除了芯片和服务器硬件,还需建设丰富的计算软件生态,才能支持计算应用良好且高效地运行。本报告将结合开源矩阵计算库OpenBLAS的经验,介绍异构计算软件框架PerfXAPI:它可以使应用在多种异构计算硬件之间无缝迁移,让异构计算更便捷、更高效,并将展示PerfXAPI在视觉CNN模型以及语言大模型上的应用情况。

Rachel Hu

Build enterprise-grade LLMs on your private data

Rachel Hu is the Co-founder & CEO of CambioML (YC S23). Previously, she was an Applied Scientist and developed LLMs at AWS AI. She also co-authored open-source ML projects, including D2L.ai (100k+ MAU), and served as a senior speaker for AWS. She received her master's degree in Statistics from the University of California, Berkeley, and her Bachelor's degree in Mathematics from the University of Waterloo, Canada.

Are you interested in building your own enterprise-grade LLMs using your private data? In this talk, we will have a quick demo showing you how to train a customized AI agent on your private data via CambioML open-source toolkits. You will see how to use our open-source library, uniflow, to transform unstructured data (e.g., HTML, PDF, etc.) into structured question-answer (QA) pairs, then use our open-source library, pykoi, to finetune your own customized LLMs on the QA pairs with a few lines of Python code via RLHF.

赵家程

AI编译和传统编译结合挖掘芯片算力潜能

赵家程,博士,现任中国科学院计算技术研究所处理器芯片全国重点实验室副研究员、硕士生导师,计算所新百星。主要研究方向为面向领域定制架构的编译技术,包括针对GPU、NPU、DPU等的编程语言、编译系统和运行时系统。相关研究成果发表在OSDI、ASPLOS、HPCA、TOCS、TPDS、MLSys、TACO、PACT、ICS等领域内国际期刊和会议上,主持了包括重点研发课题在内的多个项目,相关编译技术已经应用在寒武纪、华为昇腾等芯片上。

充分发挥芯片的性能是编译器长久以来的追求,这在AI时代显得更加重要。本报告将汇报一系列结合AI编译和传统编译的优化技术,探索如何利用跨越多个层次的编译优化技术,构建面向AI应用的高效基础设施。

李伯勋

面向大模型的多维度软硬件协同优化

李伯勋,清华大学电子工程系本硕,曾任职于百度深度学习研究院、360、旷视等公司,现任无问芯穹智能科技有限公司算法总监,负责自然语言及多模态大模型的模型训练与优化。

人工智能的发展已经进入大模型时代,通用大模型已经在多个领域发挥作用。大模型的不断发展对算力产生了更多的需求。本次报告将介绍无问芯穹在大模型软硬件协同优化的最新进展,包括:1. 长文本训练与推理,从支持2k token的快速训练和推理到支持128k+ token的系统能力;2. 嵌入式优化,把语言或多模态生成模型部署到消费级显卡甚至手机等终端设备上;3. 一站式部署,高效部署工具链让大模型能够以低人力成本部署到各种场景。

弋力

面向交互的四维视觉理解与生成

弋力博士,现任清华大学交叉信息研究院助理教授。他在斯坦福大学取得博士学位,导师为Leonidas J. Guibas教授,毕业后在谷歌研究院任研究科学家。在此之前,他在清华大学电子工程系取得了学士学位。他近期的研究兴趣涵盖三维视觉和具身人工智能,他的研究目标是使智能机器人具备理解三维世界并与之互动的能力。他在计算机视觉、计算机图形学以及机器学习领域的顶级会议发表论文五十余篇,并担任CVPR 2022、CVPR 2023、IJCAI 2023、NeurIPS 2023领域主席。他的工作在领域内得到广泛关注,引用数18000+,代表作品包括ShapeNet Part、光谱图CNN、PointNet++等。

过去几年里,静态三维场景感知取得了巨大进步。然而仅感知三维静态场景通常无法满足智能体与环境交互的需求。一方面,智能体需要对四维动态场景进行感知并与之交互,尤其当此类动态变化来自人与智能体的交互。另一方面,为了支持智能体更好地处理此类动态交互,也需要在虚拟世界中复现人与场景的动态交互。这些都对当今的三维计算机视觉系统提出了巨大挑战。在这次报告中,我将从四维数据集构建、四维深度视觉理解和面向交互的视觉合成这三个方面重点介绍课题组近期在智能体交互与感知方面所做的努力。

陈睿

Visual and Tactile Sim2Real^2 for Embodied AI

陈睿,博士,清华大学机械系助理研究员,主要研究方向为三维视觉、触觉感知、智能制造、智能机器人,于清华大学获得学士学位和博士学位,入选博士后创新人才支持计划和清华大学“水木学者”计划,承担国自然青年基金,在TRO/TPAMI/TMECH/CVPR/ICCV/ICRA等SCI期刊与国际会议发表学术论文二十余篇,已授权发明专利5项。

Visual sensing and tactile sensing are the two most important sensory methods for humans to percept and interact with the environment. Visual sensing provides global geometric and semantic information of the environment, while tactile sensing provides local geometric information and contact force information during interactions. How to effectively utilize these two different yet complementary sensory signals to enhance the interactive ability of intelligent robots in real environments is a cutting-edge problem for both academia and industry. This talk will introduce the speaker’s recent progress in this direction, which mainly includes: 3D sensing and sim2real; a bidirectional sim-real transfer method for vision-based tactile sensors; and a robot active manipulation framework by digital twinning.

许华哲

基础大模型:机器人操作的先验知识库

许华哲博士现为清华大学交叉信息研究院助理教授,博士毕业于加州大学伯克利分校,后于斯坦福大学从事博士后研究。其研究领域是具身人工智能(Embodied AI)的理论、算法与应用,具体研究方向包括深度强化学习、机器人学、基于感知的控制(Sensorimotor)等。其科研围绕具身人工智能的关键环节,系统性地研究了视觉深度强化学习在决策中的理论、模仿学习中的算法设计和高维视觉预测中的模型和应用,对解决具身人工智能领域中数据效率低和泛化能力弱等核心问题做出多项贡献。其发表顶级会议论文四十余篇,代表性工作曾被MIT Tech Review、Stanford HAI等媒体报道。曾在IJCAI 2023、ICRA 2024担任领域主席/副主编。

本次报告将探讨如何在机器人操作中整合视觉基础模型(visual foundation models)和语言模型(LLMs),讨论这些模型如何增强机器人操纵,例如生成视觉目标、寻找物体对应关系(correspondence),甚至生成模拟任务。通过利用提示调整和适应技术,我们发现这些基础模型中的先验知识对简化人类设计机器人任务的各个环节非常有帮助,进一步提升了机器人的操作能力。

刘征瀛

AI4Math: 挑战与进展

刘征瀛,华为诺亚方舟实验室研究员,博士毕业于Université Paris-Saclay,主要研究方向为 AI for Math 及大语言模型推理能力的理解与应用。

大语言模型(LLMs)在文本分类、机器翻译、文本摘要、常识问答等大多数自然语言处理的传统任务上已经达到了类人甚至超人的效果。然而,在符号推理、数学推理等任务上,LLMs 仍然具有较大的改进空间。我们将介绍近期在使用大语言模型完成定理证明、数学应用题求解等符号推理任务上的一些尝试,并介绍我们提出的“硬化”概念和相关工作,这有助于理解大语言模型的可解释性、提升推理效率及泛化性能。在定理证明问题上,我们提出的 LEGO-Prover 在证明定理的同时会不断添加引理来丰富定理库,在标准评测集 miniF2F 上达到远超 SotA 的水平(48% -> 57%)。我们还将结合 LEGO-Prover 探讨如何定义定理/工具的有用性,以及如何提升所生成定理的复用性。

贺笛

Towards Revealing the Mystery Behind Chain of Thought in LLMs

贺笛是北京大学的助理教授和博士生导师,曾在微软亚洲研究院担任高级研究员。他的研究兴趣包括自然语言处理、图神经网络,以及机器学习技术在科学探索中的应用。贺笛及其团队获得KDD 2021分子属性预测挑战赛冠军和NeurIPS 2021分子动力学模拟挑战赛冠军。他在机器学习领域的顶级会议上发表了几十篇论文,长期担任顶级机器学习会议的领域主席,是2023年ICLR杰出论文奖获得者。

最近的研究发现,思维链(Chain-of-Thought,CoT)提示可以显著提高大型语言模型(LLMs)的性能,特别是在处理涉及数学或推理的复杂任务的时候。尽管经验上取得了巨大的成功,但CoT背后的机制以及它如何发挥LLMs的潜力仍然难以捉摸。在本报告中,我们首次尝试在理论上回答这些问题。具体而言,我们研究了具有CoT的LLMs在解决基本数学和决策问题中的表达能力。我们首先给出了一个不可能性结果,表明有限深度的Transformer模型无法直接生成基本算术/方程任务的正确答案,除非模型大小相对于输入长度超多项式增长。相反,我们通过构造证明了具有恒定大小的自回归Transformer足以通过使用常用数学语言格式生成CoT推导来解决这两个任务。此外,我们展示了具有CoT的LLMs能够解决一类被称为动态规划的一般决策问题,从而证明了它在应对复杂的现实任务中的能力。最后,我们对四个任务进行了大量实验,结果显示,虽然Transformer模型在直接预测答案时总是失败的,但它们可以在提供足够的CoT演示的情况下一步一步地学会生成正确的解决方案。

刘勇

In-context Learning隐式更新机理研究

刘勇,中国人民大学,副教授,博士生导师。长期从事机器学习基础理论研究,共发表论文60余篇,其中以第一作者/通讯作者发表顶级期刊和会议论文近40篇,涵盖机器学习领域顶级期刊JMLR、IEEE TPAMI、Artificial Intelligence和顶级会议ICML、NeurIPS等。获中国人民大学“杰出学者”称号,入选中国科学院“青年创新促进会”、中国科学院信息工程研究所“引进优青”。主持国家自然科学基金面上/青年项目、北京市面上项目、中科院基础前沿科学研究计划、腾讯犀牛鸟基金、CCF-华为胡杨林基金等项目。

预训练大语言模型表现出惊人的上下文学习(In-context Learning,ICL)能力:给定少数几个示例,模型在不更新参数的情况下即可在新任务上表现出极好的学习性能,然而ICL的内在学习机理仍不清楚。本报告将ICL的推理过程解释为一种对比学习模式下的隐式梯度更新过程,从对比学习的视角给出ICL的一种全新解释,并从该角度提出几种改进原有ICL方法的思路。
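上下文学习的形式可以用一个拼接 few-shot 提示词的小例子来说明(纯演示,示例任务与内容为虚构):

```python
def build_icl_prompt(examples, query):
    """将若干 (输入, 输出) 示例与待预测输入拼成 few-shot 提示词。"""
    lines = []
    for x, y in examples:
        lines.append(f"输入:{x}\n输出:{y}")
    # 最后只给出新输入,留空输出,由模型在不更新参数的情况下续写
    lines.append(f"输入:{query}\n输出:")
    return "\n\n".join(lines)

examples = [
    ("这部电影太精彩了", "正面"),
    ("服务态度很差", "负面"),
]
prompt = build_icl_prompt(examples, "质量超出预期")
print(prompt)
```

模型对这段提示词的续写即为预测结果,整个过程没有任何梯度更新,这正是 ICL 机理研究关心的现象。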

卞伊琳

生成式AI - 消费者洞察的game changer:探索智能技术在市场分析中的应用

卞伊琳,拥有近20年的品牌管理和市场研究经验,着力于将多年来在各种行业领域积累的丰富的小数据研究经验用于大数据的研究和运用,使大数据技术落地,真正为客户提供有价值、可操作的消费者洞察。与极速洞察的技术团队一起一手打造了SocialQuest、风铃问卷平台/风铃社区等一系列大小数据产品。

随着生成式AI的迅速普及,它也为市场研究带来了更多可能性。在本次演讲中,我们将探讨生成式AI在消费者调研各环节的应用场景,并介绍极速洞察在结合生成式AI搭建业务模型、创新调研平台上的种种实践,一同探索生成式AI如何成为市场研究中的game changer,引领未来研究的新方向。

王梦佳

文档智能+行业大模型:助力企业数字化创新实践

王梦佳,阿里巴巴高级算法专家,2015年浙江大学硕士毕业后加入阿里,先后在阿里云城市大脑、数据中台和阿里企业智能团队工作。阿里云城市大脑初创团队成员,参与杭州城市大脑项目,以及阿里云多个数据中台和行业算法解决方案的设计与落地。目前负责文档智能技术和行业大模型在企业数字化场景的技术落地和应用。

1. 文档智能背景与技术挑战
2. 文档智能技术方案
3. 面向企业数字化场景的行业大模型建设
4. 文档智能+行业大模型在企业数字化场景的技术落地应用

冯俊晨

神奇的生产力提高是什么?以及如何找到它

冯俊晨,芝加哥大学公共政策博士,互联网教育行业从业10年的数据科学家。在如何度量学习过程、外化学习效果,以及如何在劳动力密集型行业通过数据闭环提效这两个问题上有长期实践,并积累了一定的成功经验。目前负责火花思维的大数据团队和数据科学团队,主要研究兴趣是如何在教学过程(包括在职学习)中植入大语言模型和数据闭环。

如何定义GPT对于企业生产力的提高?如何进行度量?如何进行交付?本报告以火花思维内部GPT聊天工具周活跃用户占员工总数比例从1%提升到8%的运营历程为案例,探讨如何将LLM变成产生10倍以上ROI业务价值的生产力工具。

姚凯

人工智能时代调研如何智能化

姚凯,中央财经大学商学院副教授,Credamo见数创始人,博士毕业于北大光华管理学院营销建模方向,美国宾大沃顿商学院联合培养博士,本科和硕士分别毕业于北师大和北大计算机系。研究领域包括:互联网营销、大数据营销、社会网络分析,论文发表于《营销科学学报》、《管理科学》、《Journal of Business Ethics》、《Enterprise Information Systems》等,曾主持和参与多个国家自然科学基金项目,同时负责市策大赛、心理精英赛两个全国性比赛的运营与执行工作。

在人工智能时代,新兴技术在不同应用场景得到了广泛应用。市场调研是消费者洞察与管理决策的核心工作,人工智能与调研的深入结合将使调研效率得到飞速提升,同时也将对调研行业发展和人才培养产生重要影响。该报告将介绍如何利用人工智能技术提升市场调研效率,以及调研行业人才培养如何进一步提升质量。

殷磊

WealthGPT——让投资更简单

殷磊博士现任微众银行资深室经理、智能资管负责人,负责基于AI与另类数据驱动的资产管理和风险管理。殷磊博士带领团队获得2021年彭博量化大赛特等奖;同时,其带领研发的新一代ESG投资平台“揽月”荣获《哈佛商业评论》中国新增长创新实践榜单,以及证券期货业金融科技研究发展中心(深圳)2020年度研究课题一等奖。殷磊博士之前曾任百度慧眼负责人、去哪儿网技术总监、简普科技风控技术负责人。

投资是一项具有挑战性的任务。然而,随着科技的飞速发展,我们有了更多机会利用最新的技术来改进投资决策过程。本次演讲将介绍一个为投资而生的大模型WealthGPT,以及其在宏观市场判断、板块轮动分析、个性化投资建议层面的应用。同时,演讲还会剖析WealthGPT如何帮助机构构建ESG评级体系,促进社会责任投资和绿色主题基金的发展。

张丹

面向落地的数据分析:让“可能”变成了“可行”

张丹,R语言实践者,北京青萌数海科技有限公司CTO,微软MVP。

10年以上互联网应用架构经验,在R、大数据、数据分析等方面有深厚的积累。精通量化投资交易策略,熟悉中国金融二级市场、交易规则和投研体系。 熟悉数据学科方法论,在海关、药监、外汇等监管科技领域均有落地项目。

著有《R的极客理想:量化投资篇》、《R的极客理想:工具篇》、《R的极客理想:高级开发篇》,图书英文版被CRC出版集团引进,在美国发行。个人博客:http://fens.me 。

现在我们正处于大数据时代,处处都产生数据,大部分数据已经不再稀缺,分析方法和算法模型也都写在了教科书中。如何挖掘出数据的价值,让数据分析落地,把数据价值转换为自身价值,是数据分析师核心要考虑的问题。

数据分析要解决实际业务场景问题,伪需求、不清晰的目标,都会造成项目失败。数据分析不只是指标体系,更不是指标堆积。市场在变,数据也在变,我们的知识结构也要跟着变化。数据分析是跨学科的工作,对人的要求也越来越高,调包侠的时代已过。要以新的视角,看数据、看业务、看技术发展、看我们自己,适应变化,才能把项目做好、落地。

结合我多年数据分析的实际经验,本报告将展示如何借助R语言让“可能”变成“可行”。

高天辰

基于神经网络的统计文献引用数预测

高天辰,厦门大学数理统计博士在读,研究兴趣包括:复杂网络数据的统计建模、大数据营销、进化博弈论与生物统计。在《Expert Systems with Applications》、《Journal of the Royal Statistical Society Series C (Applied Statistics)》、《Statistics and Its Interface》、《经济管理学刊》等期刊发表或被接受论文5篇,参与编著《网络结构数据分析与应用》等教材。

Citation counts are a crucial factor in evaluating the quality of research papers. Therefore, it is vital to accurately predict citation counts and explore the mechanisms underlying citations. In this study, we focus on predicting the citation counts in the field of statistics. We collect 55,024 academic papers published in 43 statistics journals between 2001 and 2018. Furthermore, we clean the data into a high-quality dataset and then construct multi-layer networks from different perspectives, including journal networks, author citation networks, co-citation networks, co-authorship networks, and keyword co-occurrence networks. Additionally, we extract 77 factors for citation counts prediction, including 22 traditional and 55 network-related factors. To address the issues of zero-inflated and over-dispersed citation counts, a neural network model is designed to achieve high prediction accuracy. Furthermore, we adopt a leave-one-feature-out approach to investigate the importance of these factors. The proposed neural network model achieves an MAE value of 7.352, which outperforms other machine learning models in the comparison. Thus, this study provides a useful guide for researchers to predict citation counts and can be easily extended to other research fields.
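The leave-one-feature-out idea above can be sketched in a few lines (a generic illustration with synthetic data and ordinary least squares, not the paper's neural network or dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 4
X = rng.normal(size=(n, p))
# Only features 0 and 1 actually drive the response
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

def fit_mae(X, y):
    """Least-squares fit, in-sample mean absolute error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean(np.abs(X @ beta - y))

base_mae = fit_mae(X, y)
# Importance of feature j = MAE increase when feature j is dropped
importance = np.array([
    fit_mae(np.delete(X, j, axis=1), y) - base_mae for j in range(p)
])
print(importance)
```

The two informative features show a clear MAE increase when removed, while the two noise features show essentially none, which is the signal the leave-one-feature-out procedure exploits.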

王子涵

Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback

Zihan Wang is an undergraduate student from GSAI, Renmin University of China, where he is currently advised by Prof. Zhicheng Dou. He works closely with Prof. Heng Ji from UIUC and Dr. Weiyan Shi from Stanford University. His research interest mainly lies in augmented language models, including (1) general language model interaction (2) the cross application of language models and information retrieval (IR) systems.

Current LLM evaluations focus on single-turn and overlook multi-turn real-world scenarios. We introduce the MINT benchmark to assess LLMs in interaction with tools and language feedback. Our study of 20 LLMs shows they benefit from multi-turn interactions, but current RLHF and SIFT methods might hinder this. MINT aims to encourage research on LLM multi-turn capabilities, especially in open-source models.

袁立凡

Boosting Language Models with High-quality Feedback

Lifan Yuan is a member of THUNLP, advised by Prof. Zhiyuan Liu. Currently, he is also a research intern at Blender NLP, UIUC, working with Prof. Heng Ji and Prof. Hao Peng. His research interests mainly lie in building trustworthy NLP systems, and he is also interested in enhancing LM agents through internal alignment (e.g., RLHF) and external interaction (with tools/feedback).

Reinforcement learning from human feedback (RLHF) has become a pivot technique in aligning large language models (LLMs) with human preferences. However, the scarcity of diverse, naturalistic datasets of human preferences on LLM outputs at scale poses a great challenge to RLHF as well as feedback learning research within the open-source community. This work investigates how high-quality feedback data can enhance LLMs.

刘泽一

基于强化学习的无人机空战决策

刘泽一,中国航空研究院,博士毕业于国防科技大学,主要研究方向为复杂网络分析与多智能体深度强化学习。

作为未来战场的主要组成部分,无人机急需具备自主空战决策能力。基于规则的空战决策算法往往无法适应复杂的战场环境,而强化学习技术可以克服这一问题,根据实时态势进行空战决策。针对近距空战自主决策问题,本报告提出了基于SAC算法的无人机自主空战决策方法:采用六自由度飞行器模型,并考虑导弹等空战要素的影响构建仿真环境,以提高问题的真实性;根据敌我飞行器的角度、位置、速度等关系设计奖励函数,并通过自博弈的方法训练智能体。仿真结果表明,该方法能够实现无人机自主空战决策的目标,提高了无人机的自主决策能力。
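奖励函数的设计思路可用如下简化示意说明(各项形式与系数均为本文假设,仅用于说明“角度、距离、速度加权”的构造方式,并非报告实际所用的奖励函数):

```python
import math

def air_combat_reward(angle_adv, dist, opt_dist=1000.0, closing_speed=0.0):
    """一个简化的近距空战奖励示意。

    angle_adv: 角度优势,取值 [-1, 1],1 表示机头正对敌机且位于其尾后。
    dist: 敌我距离(米),opt_dist 附近奖励最高。
    closing_speed: 接近速度(米/秒),以小权重鼓励占位。
    """
    r_angle = angle_adv                                       # 角度优势项
    r_dist = math.exp(-((dist - opt_dist) / opt_dist) ** 2)   # 距离项,高斯形
    r_speed = 0.001 * closing_speed                           # 速度项
    return 0.6 * r_angle + 0.3 * r_dist + 0.1 * r_speed

print(air_combat_reward(1.0, 1000.0))
```

实际系统中各项的权重与形式需结合导弹包线、态势评估模型等反复调试,这里仅展示加权组合的骨架。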

肖一凡

自动驾驶的安全体系设计

肖一凡,小马智行科技有限公司,自动驾驶算法工程师。硕士毕业于清华大学计算机系,毕业后加入华为技术有限公司。
曾任华为手机产品线语音助手交付负责人、华为北京研究院AutoML高级算法工程师。2021年加入小马智行,从事自动驾驶研发工作。
研究方向包括感知算法、规控、GPU加速、AutoML算法等,同时对算法的性能优化、部署、线上运营有丰富经验。

自动驾驶的安全性至今尚未被大众认同,安全也成为自动驾驶商业化路上最大的难点。那么当前业界是如何保障自动驾驶安全性的?未来自动驾驶能达到何种程度的安全?本报告会围绕这些问题展开讨论,欢迎大家参会一同探讨。

杨昆

面向无人驾驶出行服务的产品设计

杨昆,百度阿波罗智能有限公司,无人驾驶产品经理。硕士毕业于美国哥伦比亚大学,毕业后加入商汤科技,参与AI+交通、AI+零售行业产品落地。2021年加入百度,从事自动驾驶产品设计工作。

从用户需求的角度阐述无人驾驶产品的定位和竞争优势,探讨新的AI能力对传统出行行业的颠覆,以及围绕新的技术如何进行产品设计。

魏佳辉

创新型自动驾驶研发中的工程问题

魏佳辉,图森未来资深软件研发工程师,目前作为车载软件方向下高精地图部门负责人,主导图森高精地图全流程的服务和工具开发工作。硕士毕业于美国加州大学圣克鲁斯分校(UCSC),曾作为早期员工在图森北美实习,毕业之后回国加入图森中国。

自动驾驶作为一个全新的领域,很多问题尚未被解决。针对这种场景,如何设计中间件与工具链,更好地支持算法的迭代和测试,是一个复杂的问题。本报告会结合图森的一些做法介绍一种思路,与大家一起讨论。

王菲菲

Social Behavior Analysis in Exclusive Enterprise Social Networks by FastHAND

王菲菲,中国人民大学统计学院副教授,北京大学光华管理学院统计学博士。研究上关注文本挖掘及其商业应用、社交网络分析、大数据建模等,研究论文发表于Journal of Econometrics、Journal of Business & Economic Statistics、Journal of Machine Learning Research、中国科学(数学)等国内外高水平期刊上。主持并参与了国家自科基金项目、教育部社科重大项目、国家重点研发项目等多个课题。曾获中国人民大学优秀科研成果奖、课外教学优秀奖等。

There is an emerging trend in the Chinese automobile industry that automakers are introducing exclusive enterprise social networks (EESNs) to expand sales and provide after-sale services. The traditional online social networks (OSNs) and enterprise social networks (ESNs), such as Twitter and Yammer, are ingeniously designed to facilitate unregulated communications among equal individuals. However, users in EESNs are naturally socially stratified, consisting of both enterprise staff and customers. In addition, the motivation to operate EESNs can be quite complicated, including providing customer services and facilitating communication among enterprise staff. As a result, the social behaviors in EESNs can be quite different from those in OSNs and ESNs. In this work, we aim to analyze the characteristics of social patterns in EESNs and study the driving forces of social link formation by formulating it as a link prediction problem in heterogeneous social networks. In a typical EESN provided by the Chinese car manufacturer NIO, we derive plentiful user features, build multiple meta-path graphs, and develop a novel Fast (H)eterogeneous graph (A)ttention (N)etwork algorithm for (D)irected graphs (FastHAND) to predict directed social links among users. The algorithm introduces feature group attentions at the node level and uses an edge sampling algorithm over directed meta-path graphs to reduce the computation cost. Experimental results demonstrate the predictive power of our proposed method and support our intuitions about social affinity propagation in EESNs.

周峰

Integration-free Training for Spatio-temporal Multimodal Covariate Deep Kernel Point Processes

周峰,中国人民大学统计学院讲师,中国人民大学杰出青年学者。主要从事统计机器学习、贝叶斯方法、随机过程等。现已在Journal of Machine Learning Research、Statistics and Computing、Machine Learning、ICLR、NeurIPS等期刊会议发表论文20余篇。主持国家自然科学基金青年项目,中国博士后基金特别资助、面上资助,入选博士后国际交流计划引进项目,中国人民大学科研国际合作支持项目星火计划。

In this study, we propose a novel deep spatio-temporal point process model, Deep Kernel Mixture Point Processes (DKMPP), that incorporates multimodal covariate information. DKMPP is an enhanced version of Deep Mixture Point Processes (DMPP), which uses a more flexible deep kernel to model complex relationships between events and covariate data, improving the model's expressiveness. To address the intractable training procedure of DKMPP due to the non-integrable deep kernel, we utilize an integration-free method based on score matching, and further improve efficiency by adopting a scalable denoising score matching method. Our experiments demonstrate that DKMPP and its corresponding score-based estimators outperform baseline models, showcasing the advantages of incorporating covariate information, utilizing a deep kernel, and employing score-based estimators.

刘越

Trustworthy Policy Learning under the Counterfactual No-Harm Criterion

刘越,中国人民大学讲师,2019年博士毕业于北京大学。多篇文章发表于Journal of Machine Learning Research(JMLR), Artificial Intelligence(AIJ), IEEE Transactions on Knowledge and Data Engineering(TKDE), IEEE Transactions on Neural Networks and Learning Systems(TNNLS), SIGKDD, International Conference on Machine Learning(ICML),The Conference on Uncertainty in Artificial Intelligence(UAI)等机器学习与统计学期刊及会议。
研究兴趣主要包括因果推断,贝叶斯网络以及基于因果推断的机器学习算法等。

Trustworthy policy learning has significant importance in making reliable and harmless treatment decisions for individuals. Previous policy learning approaches aim at the well-being of subgroups by maximizing the utility function (e.g., conditional average causal effects, post-view click-through & conversion rate in recommendations); however, the individual-level counterfactual no-harm criterion has rarely been discussed. In this paper, we first formalize the counterfactual no-harm criterion for policy learning from a principal stratification perspective. Next, we propose a novel upper bound for the fraction negatively affected by the policy and show the consistency and asymptotic normality of the estimator. Based on the estimators for the policy utility and harm upper bounds, we further propose a policy learning approach that satisfies the counterfactual no-harm criterion, and prove its consistency to the optimal policy reward for parametric and non-parametric policy classes, respectively. Extensive experiments are conducted to show the effectiveness of the proposed policy learning approach for satisfying the counterfactual no-harm criterion.

蔡智博

Empowering Collaborative Filtering with Principled Adversarial Contrastive Loss

蔡智博,男,现任中国人民大学统计学院数据科学与大数据统计系讲师。主要研究兴趣包括充分降维、变量选择及其在机器学习中的应用等。学术论文在JASA、ICLR、NeurIPS等期刊会议上发表。

Contrastive Learning (CL) has achieved impressive performance in self-supervised learning tasks, showing superior generalization ability. Inspired by the success, adopting CL into collaborative filtering (CF) is prevailing in semi-supervised top-K recommendations. The basic idea is to routinely conduct heuristic-based data augmentation and apply contrastive losses (e.g. InfoNCE) on the augmented views. Yet, some CF-tailored challenges make this adoption suboptimal, such as the issue of out-of-distribution, the risk of false negatives, and the nature of top-K evaluation. They necessitate the CL-based CF scheme to focus more on mining hard negatives and distinguishing false negatives from the vast unlabeled user-item interactions, for informative contrast signals. To bridge the gap, we delve into the reasons underpinning the success of contrastive loss in CF, and propose a principled Adversarial InfoNCE loss (AdvInfoNCE), which is a variant of InfoNCE, specially tailored for CF methods. AdvInfoNCE adaptively explores and assigns hardness to each negative instance in an adversarial fashion and further utilizes a fine-grained hardness-aware ranking criterion to empower the recommender's generalization ability. Training CF models with AdvInfoNCE, we validate the effectiveness of AdvInfoNCE on both synthetic and real-world benchmark datasets, thus showing its generalization ability to mitigate out-of-distribution problems.
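For reference, the standard InfoNCE loss that AdvInfoNCE generalizes can be written in a few lines (a generic NumPy sketch of InfoNCE itself, not the authors' AdvInfoNCE implementation):

```python
import numpy as np

def info_nce(pos_score, neg_scores, tau=0.1):
    """Standard InfoNCE: negative log-softmax of the positive among all candidates.

    pos_score: similarity of the (user, positive item) pair.
    neg_scores: array of similarities of (user, negative item) pairs.
    tau: temperature controlling how sharply hard negatives are weighted.
    """
    logits = np.concatenate(([pos_score], neg_scores)) / tau
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

# A positive scoring above the negatives yields a small loss, and vice versa
low = info_nce(0.9, np.array([0.1, 0.2, 0.0]))
high = info_nce(0.1, np.array([0.9, 0.8, 0.7]))
print(low, high)
```

AdvInfoNCE, as described above, departs from this baseline by learning per-negative hardness weights adversarially instead of treating all negatives uniformly.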

Rong Ma

A Spectral Method for Assessing and Combining Multiple Data Visualizations

Rong Ma is an Assistant Professor of Biostatistics at Harvard University. He got his PhD in biostatistics from the University of Pennsylvania in 2021, where he was jointly advised by T. Tony Cai and Hongzhe Li. After that, he was a postdoctoral scholar in statistics at Stanford University, advised by David Donoho. Dr. Ma's research interest lies in the intersection of statistics and biomedical data science. He currently focuses on (i) statistical inference for large random matrices and high-dimensional models, (ii) theoretical and computational underpinning of modern nonlinear embedding techniques and manifold learning algorithms, and (iii) developing principled and interpretable machine learning methods for single-cell omics, microbiomics, and immunology.

Dimension reduction is an indispensable part of modern data science, and many algorithms have been developed. However, different algorithms have their own strengths and weaknesses, making it important to evaluate their relative performance, and to leverage and combine their individual strengths. This paper proposes a spectral method for assessing and combining multiple visualizations of a given dataset produced by diverse algorithms. The proposed method provides a quantitative measure -- the visualization eigenscore -- of the relative performance of the visualizations for preserving the structure around each data point. It also generates a consensus visualization, having improved quality over individual visualizations in capturing the underlying structure. Our approach is flexible and works as a wrapper around any visualizations. We analyze multiple real-world datasets to demonstrate the effectiveness of the method. We also provide theoretical justifications based on a general statistical framework, yielding several fundamental principles along with practical guidance.
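The core spectral idea, scoring candidates by the leading eigenvector of their pairwise-similarity matrix and averaging with those scores, can be illustrated on synthetic one-dimensional "views" (a toy sketch of the principle only; the paper's eigenscore is computed from local neighborhood structure around each data point, not raw correlations):

```python
import numpy as np

rng = np.random.default_rng(1)
truth = rng.normal(size=50)
# Three "visualizations" of the same structure: two decent, one mostly noise
views = np.stack([
    truth + 0.2 * rng.normal(size=50),
    truth + 0.3 * rng.normal(size=50),
    2.0 * rng.normal(size=50),
])

# Pairwise-similarity (correlation) matrix between the candidate views
S = np.corrcoef(views)
# Entries of the leading eigenvector act as quality scores
vals, vecs = np.linalg.eigh(S)
score = np.abs(vecs[:, -1])
# Consensus = score-weighted average of the views
consensus = (score / score.sum()) @ views
print(score)
```

The two faithful views reinforce each other and receive large scores, the noise view is down-weighted, and the consensus tracks the underlying signal better than any single noisy view.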

Jingshu Wang

Joint Trajectory Inference for Single-cell Genomics Using Deep Learning with a Mixture Prior

Jingshu Wang is currently an assistant professor at the Department of Statistics, the University of Chicago. Her main research interest is in developing statistical methods for cutting-edge bio-technologies and genetic problems. She currently works on problems in single-cell omics, Mendelian Randomization and structural variation in the 3D genome. Her research also includes developing general statistical methodology in causal inference and hypotheses testing that arise from new challenges in genetics and public health.

Trajectory inference methods are essential for analyzing the developmental paths of cells in single-cell sequencing datasets. It provides insights into cellular differentiation, transitions, and lineage hierarchies, helping unravel the dynamic processes underlying development, disease progression, and tissue regeneration. However, many existing tools for trajectory inference lack a cohesive statistical model and reliable uncertainty quantification, limiting their utility and robustness. In this talk, I will introduce VITAE (Variational Inference for Trajectory by AutoEncoder), a novel statistical approach that integrates a latent hierarchical mixture model with variational autoencoders to infer trajectories. Notably, VITAE uniquely enables simultaneous trajectory inference and data integration across multiple datasets, enhancing accuracy in both tasks. I will show that VITAE outperforms other state-of-the-art trajectory inference methods on both real and synthetic data under various trajectory topologies. For case studies, I will apply VITAE to jointly analyze three distinct single-cell RNA sequencing datasets of the mouse neocortex, unveiling comprehensive developmental lineages of projection neurons. VITAE effectively mitigates batch effects within and across datasets, aligning cells to elucidate clear trajectories and uncover finer structures that might be overlooked in individual datasets. If time permits, I will also showcase VITAE’s efficacy in integrative analyses of multi-omic datasets with continuous cell population structures.

Zhe Chen

The role of randomization inference in unraveling individual treatment effect profiles in clinical trials

Zhe Chen is a PhD candidate in the Department of Statistics at the University of Illinois Urbana-Champaign, mentored by Professor Xinran Li. Her research interest lies in developing novel statistical methodologies for causal inference, with particular focuses on randomization-based inference, instrumental variable, statistical matching and their applications in health and medicine.

Randomization inference is a powerful tool in early phase vaccine trials to estimate the causal effect of a regimen against a placebo or another regimen. In this talk, we will first review randomization-based inference for testing two commonly used causal null hypotheses: Fisher’s sharp null hypothesis of no treatment effect for any unit and Neyman’s weak null hypothesis of no sample average treatment effect. We then introduce a new class of causal estimands: quantiles of individual treatment effects (ITEs). Compared to the conventional focus on average treatment effects, the ITE quantiles are more robust to extreme individual treatment effects and can better characterize the heterogeneous treatment effect profiles across all individuals in the population of interest. We derive two randomization-based, finite-sample valid inferential methods for ITE quantiles and show how to conduct valid simultaneous inference for distribution functions of ITEs. Next, we will discuss how to extend some existing methods to incorporate useful auxiliary information, e.g., the assay-specific limit of detection, that is often available in a vaccine trial. In a simulation study, we compare various methods for inferring ITE quantiles under many practical data-generating processes. Lastly, we apply various methods to an early-phase clinical trial, HIV Vaccine Trials Network Study 086 (HVTN 086), to showcase the usefulness of these methods in unraveling ITE profiles in clinical trials.

赵浩宇

Do Transformers Parse while Predicting the Masked Word?

Haoyu Zhao is a graduate student at Princeton University, advised by Prof. Sanjeev Arora. He is interested in general machine learning and natural language processing, especially the intersection between deep learning theory and NLP. Prior to Princeton, he graduated from Tsinghua University in 2016.

Pre-trained language models have been shown to encode linguistic structures, e.g. dependency and constituency parse trees, in their embeddings while being trained on unsupervised loss functions like masked language modeling. Some doubts have been raised whether the models actually are doing parsing or only some computation weakly correlated with it. We study two questions: (a) Is it possible to explicitly describe transformers with realistic embedding dimension, number of heads, etc. that are capable of doing parsing -- or even approximate parsing? (b) Why do pre-trained models capture parsing structure? In this talk, we take a step toward answering these questions in the context of generative modeling with PCFGs. We show that masked language models like BERT or RoBERTa of moderate sizes can approximately execute the Inside-Outside algorithm for the English PCFG [Marcus et al., 1993]. We also show that the Inside-Outside algorithm is optimal for masked language modeling loss on the PCFG-generated data. We also give a construction of transformers with 50 layers, 15 attention heads, and 1275-dimensional embeddings on average such that, using its embeddings, it is possible to do constituency parsing with >70% F1 score on the PTB dataset. We conduct probing experiments on models pre-trained on PCFG-generated data to show that this not only allows recovery of approximate parse trees, but also recovers marginal span probabilities computed by the Inside-Outside algorithm, which suggests an implicit bias of masked language modeling towards this algorithm.

马梓业

The role of over-parametrization in non-convex optimization: A case study of matrix sensing

Ziye Ma is a fifth-year PhD student in EECS at UC Berkeley. He is supervised by Somayeh Sojoudi and Javad Lavaei. His research interests broadly lie in the fields of mathematical optimization, machine learning, and non-convexity.

This talk concerns the role of over-parametrization in solving non-convex optimization problems. The focus is on the important class of low-rank matrix sensing, where the authors propose an infinite hierarchy of non-convex problems via the lifting technique and the Burer-Monteiro factorization. This contrasts with the existing over-parametrization technique, in which the search rank is limited by the dimension of the matrix and a rich over-parametrization of arbitrary degree is not possible. It has been shown that by using the gradient descent algorithm with small initialization, spurious solutions of the problem can be transformed into strict saddle points (under some technical conditions). This is the first result in the literature showing that over-parametrization creates a negative curvature for escaping spurious solutions. As a corollary, the authors also show that implicit regularization can be expected in tensor optimization problems, underscoring the importance of a suitable optimization algorithm in the context of an over-parametrized search space.

周墨

Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

Mo Zhou is currently a Ph.D. student in the Computer Science Department at Duke University, advised by Prof. Rong Ge. Before coming to Duke, he received a B.S. in Statistics from Yuanpei College at Peking University in 2019. His research interests are in optimization and theoretical machine learning. Recently, he has been interested in deep learning theory, especially from an optimization perspective.

In deep learning, the training process often finds an interpolator (a solution with 0 training loss), yet the test loss is still low. This phenomenon, known as benign overfitting, is a major mystery that has received a lot of recent attention. One common mechanism for benign overfitting is implicit regularization, where the training process leads to additional properties for the interpolator, often characterized by minimizing certain norms. However, even for a simple sparse linear regression problem, neither the minimum-ℓ1 nor the minimum-ℓ2 norm interpolator gives the optimal test loss. In this work, we give a different parametrization of the model which leads to a new implicit regularization effect that combines the benefits of ℓ1 and ℓ2 interpolators. We show that training our new model via gradient descent leads to an interpolator with near-optimal test loss. Our result is based on a careful analysis of the training dynamics and provides another example of an implicit regularization effect that goes beyond norm minimization.
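
The flavor of parametrization-induced implicit regularization can be seen in a small experiment (illustrative only; the Hadamard-type parametrization w = u⊙u − v⊙v below is a classic example from the literature, not necessarily the talk's exact model): gradient descent from a tiny initialization finds an interpolator biased toward sparsity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 80                        # over-parametrized: more features than samples
X = rng.standard_normal((n, d))
w_star = np.zeros(d)
w_star[:3] = [1.0, -2.0, 1.5]        # sparse ground truth
y = X @ w_star                       # noiseless observations

# Reparametrize w = u*u - v*v and run gradient descent from a small init.
alpha, lr = 1e-3, 0.01
u = np.full(d, alpha)
v = np.full(d, alpha)

def loss(u, v):
    r = X @ (u * u - v * v) - y
    return 0.5 * np.mean(r ** 2)

init_loss = loss(u, v)
for _ in range(20000):
    g = X.T @ (X @ (u * u - v * v) - y) / n          # gradient w.r.t. w
    u, v = u - lr * 2 * u * g, v + lr * 2 * v * g    # chain rule through w
final_loss = loss(u, v)
```

The multiplicative updates keep coordinates outside the true support close to their small initialization, which is how the parametrization, not an explicit penalty, shapes which interpolator gradient descent reaches.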

李黎明

{mmrm}: a robust and comprehensive R package for implementing mixed models for repeated measures

Liming Li has been a senior data scientist at Roche since 2019. He is the technical engineering lead of the NEST project at Roche, and also a member of the mmrm task force of the Software Engineering Working Group (SWE WG) under the American Statistical Association (ASA) Biopharmaceutical Section (BIOP). Liming holds a master's degree in biostatistics from Fudan University.

Mixed models for repeated measures (MMRM) analysis has been extensively used in the pharmaceutical industry to analyze longitudinal datasets. SAS has been the gold standard for this analysis in the past, and so far R packages fall short for one of the following reasons: model convergence issues, unavailability of covariance structures or adjusted degrees of freedom, or numerical results that deviate substantially from SAS. To fill this important gap in the open-source statistical software landscape, a cross-company collaboration via the “Software Engineering Working Group” (SWE WG) was initiated and has developed the new {mmrm} R package. A critical advantage of {mmrm} over existing implementations is that it is faster and converges more reliably. It also provides a comprehensive set of features: users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjusted degrees of freedom, extract least-squares means estimates using the emmeans package, and use tidymodels for easy model fitting. We aim to establish {mmrm} as a new standard for fitting MMRM.

Joe Zhu

Clinical reporting with NEST packages

Joe Zhu is the lead software engineer of the NEST project at Roche. He has a PhD in statistics. Before joining Roche, Joe took two postdoc research positions at the University of Oxford, with a research focus on statistical genomics. He is an open source software advocate and developer; more details can be found at http://www.github.com/shajoezhu

New data types require new tools and methodologies, and there is a need in biometrics to respond quickly to the ever-increasing volume of data generated from clinical trials. In recent years, analysis software tools have taken huge leaps forward. Therefore, we have seized the business opportunity to build an R-based toolkit that allows us to meet these challenges. So far, we have open sourced several of our key packages on GitHub and CRAN, and many of them are well received by users. I would like to share some of our most recent updates, including new open-source packages and the tlg-catalog.

陆猛

R在药物研发中的数据可视化应用

2020年9月于美国亚利桑那大学获得统计学博士学位,2021年1月至今在默沙东生物统计和科学决策部(BARDS)担任高级统计师,参与肿瘤、疫苗研发项目。

目前药物递交中表格列表图谱的生成主要还是利用SAS进行,但默沙东一直致力于药物研发过程中的R包和工具的开发,希望可以将其应用到未来的药品递交中。本演讲将主要集中在本司开发的几个数据可视化R包的介绍上。

张文娟

R在药物研发中的数据可视化应用

2021年6月于美国科罗拉多大学丹佛分校获得应用数学博士学位,2021年8月至今在默沙东生物统计和科学决策部(BARDS)担任高级统计师,参与普药研发项目。

目前药物递交中表格列表图谱的生成主要还是利用SAS进行,但默沙东一直致力于药物研发过程中的R包和工具的开发,希望可以将其应用到未来的药品递交中。本演讲将主要集中在本司开发的几个数据可视化R包的介绍上。

戴奔

ReHLine: Regularized Composite ReLU-ReHU Loss Minimization with Linear Computation and Linear Convergence

戴奔是香港中文大学统计系的助理教授。主要研究领域包括机器学习、学习理论、统计推断和统计计算。同时对开发统计和机器学习软件感兴趣。

Empirical risk minimization (ERM) is a crucial framework that offers a general approach to handling a broad range of machine learning tasks. In this paper, we propose a novel algorithm, called ReHLine, for minimizing a set of regularized ERMs with convex piecewise linear-quadratic loss functions and optional linear constraints. The proposed algorithm can effectively handle diverse combinations of loss functions, regularizations, and constraints, making it particularly well-suited for complex domain-specific problems. Examples of such problems include FairSVM, elastic net regularized quantile regression, Huber minimization, etc.
In addition, ReHLine enjoys a provable linear convergence rate and exhibits a per-iteration computational complexity that scales linearly with the sample size. The algorithm is implemented with both Python and R interfaces, and its performance is benchmarked on various tasks and datasets. Our experimental results demonstrate that ReHLine significantly surpasses generic optimization solvers in terms of computational efficiency on large-scale datasets. Moreover, it also outperforms specialized solvers such as liblinear in SVM, hqreg in Huber minimization, and lightning (SAGA, SAG, SDCA, SVRG) in smooth SVM, exhibiting exceptional flexibility and efficiency.
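
The "composite ReLU-ReHU" loss class in the title covers losses expressible as sums of ReLU and ReHU pieces; a minimal sketch of the two building blocks (the exact ReHU convention is my reading of the abstract, so treat it as an assumption, not the package's API):

```python
def relu(z):
    return max(z, 0.0)

def rehu(z, tau):
    """Huberized ReLU: zero for z <= 0, quadratic on (0, tau], linear beyond tau."""
    if z <= 0:
        return 0.0
    if z <= tau:
        return 0.5 * z * z
    return tau * (z - 0.5 * tau)

# The SVM hinge loss is a single ReLU piece:
def hinge(y, score):
    return relu(1.0 - y * score)

# The classical Huber loss splits into two ReHU pieces, one per residual sign:
def huber(residual, delta):
    return rehu(residual, delta) + rehu(-residual, delta)
```

For example, huber(0.5, 1.0) reproduces the quadratic branch r²/2 = 0.125 and huber(2.0, 1.0) the linear branch δ(|r| − δ/2) = 1.5, matching the standard Huber definition.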

任明旸

TransGraph: an R package for transfer graph learning

任明旸现为香港中文大学统计系博士后,博士毕业于中国科学院大学数学科学学院。主要研究方向为迁移学习、亚组分析、图模型、高维数据分析和生物统计。

Transfer learning, which aims to use auxiliary domains to improve learning of the target domain of interest when multiple heterogeneous datasets are available, has been a hot topic in statistical machine learning. Recent transfer learning methods with statistical guarantees mainly focus on overall parameter transfer for supervised models in the ideal case of informative auxiliary domains with overall similarity. In contrast, transfer learning for unsupervised graph learning is in its infancy and largely follows the idea of overall parameter transfer as in supervised learning, not to mention the absence of comprehensive statistical software. In this talk, I will introduce the newly released R package, TransGraph, and our recent series of relevant research on transfer learning for several complex graphical models, including tensor Gaussian graphical models, non-Gaussian directed acyclic graphs (DAGs), and Gaussian graphical mixture models (GGMMs). Notably, this package promotes local transfer at the node level and subgroup level in DAG structural learning and GGMMs, respectively, which is more flexible and robust than the existing overall parameter transfer. As by-products, transfer learning for undirected graphical models (precision matrices) via the D-trace loss, transfer learning for mean vector estimation, and single non-Gaussian DAG learning via the topological layer method are also included in this package.

汪利军

earth.dof.patch: an R package for Correcting Degrees of Freedom of Multivariate Adaptive Regression Splines

汪利军现为耶鲁大学生物统计系博士后,主要研究方向为统计学习和生物统计。

Model degrees of freedom (df) is a fundamental concept in statistics because it quantifies the flexibility of a fitting procedure and is indispensable in model selection. The df is often intuitively equated with the number of independent variables in the fitting procedure. But for adaptive regressions that perform variable selection (e.g., the best subset regressions), the model df is larger than the number of selected variables. The excess part has been defined as the search degrees of freedom (sdf) to account for the search cost in model selection. For some complex procedures such as the multivariate adaptive regression splines (MARS), the search cost needs to be pre-determined to serve as a tuning parameter for the procedure itself, but it might be inaccurate. To investigate the inaccurate pre-determined search cost, we introduce two novel concepts, nominal df and actual df, and formulate a property named “self-consistency”, which denotes the absence of a gap between nominal df and actual df. We propose a correcting procedure tailored for MARS, which is shown to improve the fitting performance based on extensive simulation studies.

Linjun Zhang

Fair conformal prediction and risk control

Linjun Zhang is an Assistant Professor in the Department of Statistics at Rutgers University. He obtained his Ph.D. in Statistics at the Wharton School, the University of Pennsylvania, in 2019, and received the J. Parker Bursk Memorial Prize and the Donald S. Murray Prize for excellence in research and teaching, respectively, upon graduation. His current research interests include algorithmic fairness, privacy-preserving data analysis, deep learning theory, and high-dimensional statistics.

Multi-calibration is a powerful and evolving concept originating in the field of algorithmic fairness. For a predictor $f$ that estimates the outcome $y$ given covariates $x$, and for a function class $C$, multi-calibration requires that the predictor $f(x)$ and outcome $y$ are indistinguishable under the class of auditors in $C$. Fairness is captured by incorporating demographic subgroups into the class of functions $C$. Recent work has shown that, by enriching the class $C$ to incorporate appropriate propensity re-weighting functions, multi-calibration also yields target-independent learning, wherein a model trained on a source domain performs well on unseen, future target domains (approximately) captured by the re-weightings. The multi-calibration notion is extended, and the power of an enriched class of mappings is explored. HappyMap, a generalization of multi-calibration, is proposed, which yields a wide range of new applications, including a new fairness notion for uncertainty quantification (conformal prediction), a novel technique for conformal prediction under covariate shift, and a different approach for fair risk control, while also yielding a unified understanding of several existing seemingly disparate algorithmic fairness notions and target-independent learning approaches. A single HappyMap meta-algorithm is given that captures all these results, together with a sufficiency condition for its success.
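
As background for the conformal-prediction thread, a plain split conformal interval (the standard method, not the HappyMap extension) can be sketched as follows:

```python
import math

def split_conformal_interval(calib_residuals, prediction, alpha):
    """Split conformal prediction with absolute-residual scores.

    Given scores |y_i - f(x_i)| on a held-out calibration set, the interval
    prediction +/- q covers a new exchangeable point with probability
    >= 1 - alpha, where q is the ceil((n+1)(1-alpha))-th smallest score.
    """
    n = len(calib_residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return (-math.inf, math.inf)   # too few calibration points
    q = sorted(calib_residuals)[k - 1]
    return (prediction - q, prediction + q)

# 4 calibration residuals, alpha = 0.5: q is the 3rd smallest score (0.3).
lo, hi = split_conformal_interval([0.1, 0.4, 0.2, 0.3], 5.0, alpha=0.5)
```

The fairness question studied in the talk is, roughly, whether such coverage guarantees hold not just marginally but simultaneously across the subgroups encoded in $C$.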

Ruoxi Jia

Data Intelligence in Machine Learning: A New Pathway Towards Responsible AI

Ruoxi Jia is an assistant professor in the Bradley Department of Electrical and Computer Engineering at Virginia Tech. She earned her PhD from the EECS Department at UC Berkeley and a B.S. from Peking University. Jia's research interests lie broadly in machine learning, security, privacy, and cyber-physical systems; her recent work focuses on data-centric and trustworthy machine learning. Ruoxi is the recipient of the NSF CAREER Award, the Chiang Fellowship for Graduate Scholars in Manufacturing and Engineering, the 8108 Alumni Fellowship, the Okamatsu Fellowship, Virginia’s Commonwealth Cyber Initiative Award, Cisco Research Awards, and Amazon-VT Initiative Research Awards. She was selected for the Rising Stars in EECS in 2017. Ruoxi’s work has been featured in multiple media outlets such as the New York Times, MIT Technology Review, IEEE Spectrum, and Wired.

The pivotal role of large datasets in propelling the advancements of modern machine learning applications such as computer vision, natural language processing, and multi-modal learning cannot be overstated. While machine learning algorithms are often indiscriminate aggregators of given data sources, resulting in 'bad data leading to bad outcomes,' the central theme of our research advocates for a more deliberate approach. This involves understanding how data is transduced within a machine learning system, its impact on outcomes, and how one can actively select data for creating an efficient and robust machine learning solution. In pursuit of this goal, we will present a series of our recent works regarding principled frameworks for data sourcing, data selection, and data-based model behavior attribution.

Zhanrui Cai

Private Estimation and Inference in High-Dimensional Regression with FDR Control

蔡占锐现为香港大学经管学院助理教授,研究兴趣包括differential privacy、distribution-free inference等。此前他曾任爱荷华州立大学统计系助理教授,并在卡耐基梅隆大学担任博士后研究员。他于宾夕法尼亚州立大学获得博士学位。

In this paper, we present novel methodologies for conducting practical differentially private (DP) estimation and inference in high-dimensional linear regression. We start by proposing a differentially private Bayesian Information Criterion (BIC) for selecting the unknown sparsity parameter in DP-Lasso, eliminating the need for prior knowledge of model sparsity, a requisite in the existing literature. We then propose a differentially private debiased LASSO algorithm that enables privacy-preserving inference on regression parameters. Our proposed method enables accurate and private inference on the regression parameters by leveraging the inherent sparsity of high-dimensional linear regression models. Additionally, we address the issue of multiple testing in high-dimensional linear regression by introducing a differentially private multiple testing procedure that controls the false discovery rate (FDR). This allows for accurate and privacy-preserving identification of significant predictors in the regression model. Through extensive simulations and real data analysis, we demonstrate the efficacy of our proposed methods in conducting inference for high-dimensional linear models while safeguarding privacy and controlling the FDR.
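
The basic differential-privacy building block behind such methods is noise calibrated to the query's sensitivity; a generic sketch of the classical Laplace mechanism (not the paper's DP-BIC or debiased-LASSO procedure):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value + Laplace noise with scale sensitivity/epsilon.

    This satisfies epsilon-differential privacy for a query whose output
    changes by at most `sensitivity` when one record is added or removed.
    """
    return value + rng.laplace(0.0, sensitivity / epsilon)

# Private mean of n values bounded in [0, 1]: the sensitivity is 1/n.
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=1000)
true_mean = data.mean()
private_mean = laplace_mechanism(true_mean, sensitivity=1.0 / len(data),
                                 epsilon=1.0, rng=rng)
```

With n = 1000 and epsilon = 1 the noise scale is only 0.001, illustrating why privacy can come cheap for low-sensitivity statistics but becomes delicate in high-dimensional regression.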

Xiaohu Zhu

通用人工智能时代的人类未来

Uplifts Life、Center for Safe AGI和University AI创始人,中国首位通用人工智能安全研究员,谷歌机器学习GDE,Future Forum ’22 中国唯一入选人,Foresight Institute Fellow in Safe AGI、 中国伦理学会技术伦理分会成员。与Foresight Institute、DeepMind、OpenAI、Center for Human-Compatible AI、Future of Humanity Institute 和 Future of Life Institute 长期交流合作。《深入浅出神经网络与深度学习》、《人工智能缔造师》、《人工智能对齐通讯》和《人工智能蓝图》译者。《通用人工智能安全和治理》作者。

为了确保通用人工智能成为一种释放人类潜能的工具,而不是束缚甚至掌控人类未来的不可控技术,我们必须以开放的心态和坚定的决心面对这个时代,理解挑战,拥抱可能性,并始终努力确保通用人工智能服务于增强,而不是取代,我们共享的人类体验。

张博伦

LLM能给主题模型编码吗?从大语言模型反观小模型的使用

张博伦,UCSD社会学系博士候选人。研究方向为经济社会学和政治社会学,关注算法系统的社会构成,以及信息资本主义的跨国差别。

ChatGPT是由OpenAI开发的基于大语言模型的通用应用,可以用来完成自然语言处理方面的任务。社会科学的研究实践涉及大量自然语言处理的任务,因而ChatGPT可能有广阔的应用前景,但社会科学家往往处理的文本规模较小,主题也相对单一。在这种情况下,在海量文本上预训练的大模型应该如何应用到规模较小的文本上?对我们反思在这些文本上训练的小模型的使用又有什么启示?

本研究以主题模型为例,探讨ChatGPT能否为主题模型的结果生成可信的标签。我们抽取了发表在中外社会学期刊上使用了主题模型的论文,并将其主题与标签打乱顺序,通过在中外网络平台进行问卷调查,由一般用户评价二者谁更可能反映了原文的主题。

结果表明,一般用户对原论文给出的标签的评价并没有显著高于对ChatGPT给出的标签,甚至在多数主题上,认为ChatGPT标签更好的受访者比例更高。这说明ChatGPT或许可以用于给主题模型的结果生成候选标签,供研究者评判。ChatGPT给出的标签更受欢迎,也暴露了现在研究者在使用主题模型中的一些缺陷。不过这也意味着,缺乏领域专门知识的ChatGPT在某些条件下的表现可能达到专业研究者的水平,为现行的社会科学学术研究带来了新的挑战。

李林倬

使用LLM探索语言中的新颖性和科学文章影响力的关系

李林倬,浙江大学社会学系助理教授。研究方向为知识生产和创新。

科学的前进和创新依赖科学社区对新颖知识的探索。过去对知识新颖性的考察,主要基于某种外在编码体系下知识组合方式的测量,或是基于文本中是否出现新名词或新概念。基于“思想建立在文字的基础上”这一假设,本研究试图利用生成式语言模型来考察科学行文中语言使用的新颖程度、语言新颖程度与其它新颖性指标的关系,以及语言新颖程度与文章影响力之间的关系。生成式语言模型因对语言本身建模而具有独特优势:特别地,基于上下文预测下一个词的概率,可能天然提供了对语言表达新颖程度的衡量。本研究试图提供这一方向上的初步尝试,以更好地理解语言使用模式和思想创新之间的关联。
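
“基于上下文预测下一个词的概率”这一衡量思路,可以用一个极简的玩具二元语法模型示意(示例语料与模型均为虚构,与本研究使用的生成式大模型无关):

```python
import math
from collections import Counter

def train_bigram(corpus_tokens, alpha=1.0):
    """极简二元语法模型(加一平滑),用于演示"下一词概率"的计算。"""
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    unigrams = Counter(corpus_tokens)
    vocab = len(set(corpus_tokens))
    def prob(prev, word):
        return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab)
    return prob

def surprisal(tokens, prob):
    """平均负对数概率:数值越高,说明语言使用越出乎模型意料(越"新颖")。"""
    return sum(-math.log(prob(p, w))
               for p, w in zip(tokens, tokens[1:])) / (len(tokens) - 1)

corpus = "the cat sat on the mat the cat ate the fish".split()
prob = train_bigram(corpus)
common = surprisal("the cat sat".split(), prob)   # 语料中常见的搭配
novel = surprisal("the mat ate".split(), prob)    # 语料中未出现过的搭配
```

真实研究中的语言模型远比二元语法复杂,但“罕见搭配 → 更高的平均负对数概率”这一度量逻辑是一致的。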

邱慧莲

Quantifying the Link Between Promotional Language and a Scientific Idea’s Funding Potential, Inherent Innovativeness, and Future Impact

邱慧莲,美国西北大学管理学院博士后。研究方向为开源软件团队与科学学(Science of Science)。

Showing the merits of an original idea opens the door to its use. Though much is known about the generative factors that promote scientific ideas in practice, including funding, teams, diversity, or mentorship, less is known about how scientists communicate the merits of their good ideas to each other. Indeed, the penchant to view the communication process as an afterthought to scientific idea generation rather than a critical facet of discovery in its own right has inhibited advances within science in areas as critical as vaccinations, climate science, and gene theory. Here, we study a ubiquitous science communication process in grant writing that seeks funding support for new ideas. We examine links between scientific promotional language and an original idea’s funding potential, inherent innovativeness, and future impact. Promotional language has become more prevalent in grants than words reflecting rigor, and may help communicate the originality, methods, and potential future directions of innovative ideas (or exaggerate them). Nevertheless, the influence of promotional language in science is broadly unknown. We analyze tens of thousands of grants from three prominent funding agencies worldwide. Uniquely, this grant application data includes both funded and rejected proposals, mitigating the sample selection bias of most prior studies. Our analyses demonstrate a robust relationship between promotional language and the communication of scientific innovation. First, the number of promotional words in a grant predicts the funding decision with up to a four-fold increase in the probability of receiving funding. Second, promotional words reflect a grant’s degree of innovativeness and expected citation impact. Third, computer experiments that replace the observed promotional words with neutral synonyms but keep all the remaining language intact suggest that promotional words can change reviewers’ overall impressions of a proposal’s quality.
With the pivotal role of grants in selecting scientifically promising innovations and nurturing careers, we discuss the implications of our findings for speeding discovery, managing scarce scientific resources, and science policy.

陈旭

基于连续时间的深度Q网络

陈旭,博士毕业于清华大学,曾在英国伦敦大学学院进行博士后研究,于2020年加入中国人民大学。他的研究方向为大语言模型、因果推断、强化学习等。曾在TheWebConf、AIJ、SIGIR、NeurIPS、ICML、TOIS等著名国际会议/期刊发表论文70余篇,Google Scholar引用4600余次。研究成果曾获得TheWebConf 2018最佳论文提名奖、CIKM 2022 最佳资源论文Runner Up 奖、AIRS 2017最佳论文奖、CCF自然科学二等奖(排名第二),ACM-北京新星奖等。他主持/参与了多项国家自然科学基金项目以及企业合作项目。

近年来,强化学习在各类现实应用中取得了不俗的效果。传统强化学习通常将Agent的动作看成离散的动作序列,而没有考虑动作之间的时间间隔。然而很多现实应用中,Agent的动作时间间隔对整体策略的效果是有重要影响的。本次报告中,汇报者将结合其过去的工作介绍如何建模基于连续时间的强化学习问题。首先,介绍如何将连续时间问题刻画成半隐马尔可夫问题,然后利用点过程建模动作序列的时间间隔,最后从理论上分析所提出模型的有限时间复杂度。
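
上面提到的连续时间建模思路,可以用一个极简示例说明(示例数据与参数均为虚构,并非报告中的具体模型):在半马尔可夫设定下,折扣取决于动作之间实际经过的时间,而非离散步数:

```python
def discounted_return(rewards, intervals, gamma):
    """连续时间折扣回报:第 k 个奖励的折扣为 gamma ** t_k,
    其中 t_k 是到该动作为止累积经过的时间(而非步数)。"""
    total, t = 0.0, 0.0
    for r, dt in zip(rewards, intervals):
        total += (gamma ** t) * r
        t += dt   # 动作之间的时间间隔决定后续奖励的折扣
    return total

# 两条轨迹奖励完全相同,仅时间间隔不同,折扣回报却不同:
g_dense = discounted_return([1, 1, 1], [0.5, 0.5, 0.5], gamma=0.5)
g_sparse = discounted_return([1, 1, 1], [2.0, 2.0, 2.0], gamma=0.5)
```

若忽略时间间隔、按离散步数折扣,两条轨迹的回报将完全相同;这正是报告强调需要用点过程刻画动作间隔的原因。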

吴翼

Diversity-Driven Reinforcement Learning

Yi Wu is now an assistant professor at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University. He obtained his Ph.D. degree from UC Berkeley in 2019 under the supervision of Prof. Stuart Russell and was a researcher at OpenAI before moving back to Tsinghua. His research focuses on improving the generalization capabilities of learning agents. He is broadly interested in a variety of topics in AI, including deep reinforcement learning, multi-agent systems, robot learning and human-AI interaction. His representative works include the value iteration network, the MADDPG/MAPPO algorithm, and OpenAI's hide-and-seek project. He also received the best paper award at NIPS 2016.

In the classical reinforcement learning (RL) formulation, reward is often the only performance metric for an algorithm. Accordingly, existing RL literature primarily focuses on developing algorithms that can achieve high rewards while rarely considering which solution is derived. However, we will show that (optimal) policies with the same reward can yield substantially different behaviors in popular RL testbeds. Moreover, many of the (optimal) behaviors can be hardly discovered by classical RL algorithms that only strive for rewards. In this talk, we present recent advances in developing diversity-driven RL algorithms, which not only optimize rewards but also aim to discover a broad spectrum of policies with visually distinct behaviors.

段雅琦

Policy evaluation in reinforcement learning: Mixing and mis-specification

Yaqi Duan is an Assistant Professor at the Department of Technology, Operations, and Statistics at Stern School of Business at New York University. Duan’s research interests lie at the intersection of statistics, machine learning and operations research, with a focus on new statistical methodologies and theories to address challenges in data-driven decision-making problems. Her work in reinforcement learning was honored with the 2023 IMS Lawrence D. Brown Ph.D. Student Award. Duan earned her Ph.D. from Princeton University and a B.S. in Mathematics from Peking University. Prior to joining NYU Stern, Duan was a postdoctoral researcher at MIT, working with Professor Martin J. Wainwright.

We consider non-parametric estimation of the value function of an infinite-horizon discounted Markov reward process (MRP) using kernel-based least-squares temporal difference (LSTD) method. Our focus is on understanding how temporal dependence in data and model mis-specification impact the quality of estimation. We find that, under a well-specified model, temporal dependence in data has little effect on the estimation error. However, when there is significant model mis-specification, temporal dependence starts to negatively affect estimation quality. Consequently, the choice of how to harness data should depend on our confidence in the model’s fidelity. Our theory includes non-asymptotic upper bounds and matching minimax lower bounds, providing complementary insights into the observed phenomena.
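
The LSTD estimator at the heart of this talk has a simple tabular special case (the kernel version replaces the one-hot features below with kernel features; the two-state MRP is a made-up toy example):

```python
import numpy as np

def lstd(transitions, n_states, gamma):
    """Least-squares temporal difference learning with one-hot features.

    Solves A w = b, where A = sum_t phi(s_t)(phi(s_t) - gamma*phi(s_t'))^T
    and b = sum_t phi(s_t) r_t, so w estimates the value function V.
    """
    A = np.zeros((n_states, n_states))
    b = np.zeros(n_states)
    for s, r, s_next in transitions:
        phi = np.eye(n_states)[s]
        phi_next = np.eye(n_states)[s_next]
        A += np.outer(phi, phi - gamma * phi_next)
        b += phi * r
    return np.linalg.solve(A, b)

# Two-state cycle: state 0 pays reward 1, state 1 pays 0, gamma = 0.5.
# Bellman: V(0) = 1 + 0.5 V(1), V(1) = 0.5 V(0)  =>  V = (4/3, 2/3).
V = lstd([(0, 1.0, 1), (1, 0.0, 0)], n_states=2, gamma=0.5)
```

The talk's questions concern what happens to this estimator when the transitions are temporally dependent draws from a trajectory and when the feature (kernel) class cannot represent V exactly.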

彭天翼

Correcting for Interference in Experiments: A Case Study at Douyin

Tianyi Peng is an (incoming) assistant professor in the Decision, Risk, and Operations Division of Columbia University. He is broadly interested in developing algorithms for learning and inference in large-scale dynamic decision-making systems. In particular, he is interested in developing next-generation experimentation/simulation platforms, aided by reinforcement learning and generative AI, which can provide low-cost solutions for discovering/evaluating beneficial strategies. In translating these ideas, he is engaged with Anheuser-Busch InBev, Takeda Pharmaceuticals, TikTok, and Liberty Mutual. His work has been recognized as a finalist for the MSOM Student Paper Competition, and has won the INFORMS Daniel H. Wagner Prize, the Applied Probability Society Best Student Paper Prize, and the Jeff McGill Student Paper Award. He earned his Ph.D. from the Massachusetts Institute of Technology (MIT) and his Bachelor's degree from the Yao Class.

Interference is a ubiquitous problem in experiments conducted on two-sided content marketplaces, such as Douyin (China’s analog of TikTok). In many cases, creators are the natural unit of experimentation, but creators interfere with each other through competition for viewers’ limited time and attention. “Naive” estimators currently used in practice simply ignore the interference, but in doing so incur bias on the order of the treatment effect. We formalize the problem of inference in such experiments as one of policy evaluation. Off-policy estimators, while unbiased, are impractically high variance. We introduce a novel Monte-Carlo estimator, based on “Differences-in-Qs” (DQ) techniques, which achieves bias that is second-order in the treatment effect, while remaining sample-efficient to estimate. On the theoretical side, our contribution is to develop a generalized theory of Taylor expansions for policy evaluation, which extends DQ theory to all major MDP formulations. On the practical side, we implement our estimator on Douyin’s experimentation platform, and in the process develop DQ into a truly “plug-and-play” estimator for interference in real-world settings: one which provides robust, low-bias, low-variance treatment effect estimates; admits computationally cheap, asymptotically exact uncertainty quantification; and reduces MSE by 99% compared to the best existing alternatives in our applications.

杨维铠

基于样本代表性的模型性能可视分析及提升方法

杨维铠,清华大学软件学院博士五年级,师从刘世霞教授,研究领域为可视分析和训练数据质量提升。目前已发表CCF-A类期刊与会议论文8篇,其中第一作者或共同第一作者5篇。个人主页:https://vicayang.cc

在机器学习中,数据质量决定了数据分析效果的上限,而模型的改进只是不断逼近这个上限。然而在很多实际应用中,训练数据的质量往往很难保证。本演讲将分享在有标注数据量少、标注质量不高、数据分布不断变化的场景下,通过可视分析手段识别具有代表性的高质量样本,进而提升模型性能的可视分析方法。

谢李文含

情感化动态文字生成

谢李文含目前是香港科技大学计算机科学与工程系的博士研究生,师从可视化领域的知名学者屈华民教授。她的研究兴趣集中在信息可视化与人机交互的交叉领域,尤其关注叙事型可视化动画以及用户在创作过程中的互动体验。

在现代沟通的语境下,AI在叙事型可视化中的应用已经成为推动信息和情感传达创新的关键力量。该报告立足于文字这一可视化中的基础图形元素,结合两个相关工作,探讨AI如何丰富动态文字的情感表达。第一个工作通过运动迁移技术辅助生成动态文字。用户只需提供一个表情包动图,AI即可捕捉其动态特征并将其应用到文本上,创造出与原图情感语义一致的动态文字。这一过程不仅突破了传统设计的界限,还为非专业用户提供了创作生动文字动画的可能性。而第二个工作则将动态文字扩展到词云,为其中每个词赋予特定情感的动效,并通过全局参数的调整,让用户能够精确控制情感表达的细节。这不仅提高了情感可视化的准确性,也增强了故事叙述的情感沟通力。在本次讨论中,我们将深入分析这些AI驱动的工具和技术是如何使得个人和专业设计师在创作过程中的工作更为高效,以及它们如何帮助用户更加直观地表达和分享他们的情感。同时,我们也将展示这些技术如何帮助我们更好地理解和利用可视化工具在数字媒体叙事中的作用。

封颖超杰

面向文本生成图像的交互式提示词工程

封颖超杰是浙江大学计算机学院的博士研究生,师从陈为教授。研究兴趣涉及自然语言处理和可视分析的交叉领域。

文生图模型因其生成高质量图像的能力在公众中大受欢迎。然而,由于自然语言的复杂性和模糊性,制定有效的提示词以得到预期的结果是一件具有挑战性的事情。我们提出了一个可视分析系统PromptMagician,来帮助用户探索提示词的图像结果,并迭代优化提示词。系统的核心是一个提示词推荐模型,它以用户提示词作为输入,从 DiffusionDB 中检索相似的提示词-图片对,并识别其中的重要关键词。为了促进用户的交互式提示词探索和改进,PromptMagician引入了多层次可视化技术来展示检索图像和推荐关键词,并支持用户指定多维度的标准进行个性化探索。通过两个使用案例、一项用户研究和专家访谈证明了我们系统的有效性和可用性,表明该系统促进了提示词优化,并提高了文生图模型的创造力支持。

高琳

面向科普教育的交互式Transformer视觉解释分析方法

高琳,复旦大学大数据学院硕士研究生,复旦大学可视分析与智能决策实验室成员,导师为陈思明老师。目前研究方向为交互式教育与领域大模型可视分析,相关工作发表于IEEE VIS 2023。

深度学习的发展激发了人们对获取深度学习模型中知识和技能的兴趣。然而,由于其复杂的网络结构和抽象的数据表示,初学者在学习和理解 Transformer 模型时面临许多困难与挑战。因此,本次报告将分享面向科普教育的交互式视觉解释分析系统 TransforLearn,以交互式的方式帮助初学者更好地理解 Transformer 模型的训练过程。

吴冠德

Socrates:通过自适应机器引导的用户反馈搜集的数据叙事生成

Guande Wu is a Ph.D. candidate in the Tandon School of Engineering, New York University, advised by Prof. Claudio T. Silva. His research mainly lies in human-AI collaboration and visual data storytelling. Previously, he worked with many outstanding experts in visualization at Zhejiang University, Tongji University, UC Davis, Microsoft Research Asia, and Adobe Research.

数据可视化故事可以有效地传达数据的洞察,但其创建通常需要复杂的数据探索、洞察发现、叙事组织以及根据叙事者的沟通目标进行定制。然而,现有的自动化数据叙事系统往往忽视了在数据叙事创作过程中用户定制的重要性,限制了系统创建符合用户意图的定制叙事的能力。我们提出了一种新颖的数据叙事生成工作流程,该工作流程利用自适应机器引导的用户反馈搜集技术来定制叙事。我们的方法采用了一种自适应的插件模块,用于现有的故事生成系统,通过基于对话历史和数据集的交互式提问来纳入用户反馈。这种适应性提高了系统对用户意图的理解,确保最终的叙事与其目标一致。我们通过一个交互式原型Socrates来演示我们方法的可行性。通过与18名参与者的定量用户研究,我们将我们的方法与最先进的数据故事生成算法进行了比较,结果显示Socrates生成的故事与人工生成的故事在洞察方面有更大的重叠。我们还通过与三位数据分析师的访谈演示了Socrates的可用性,并突出了未来的工作领域。