news

Nov 2, 2024 :pencil: [SIGMOD 2025] Paper “Automatic Database Configuration Debugging using Retrieval-Augmented Language Models” was accepted.
Sep 26, 2024 :pencil: [NeurIPS 2024] Two papers, “Are Large Language Models Good Statisticians?” and “CRAG - Comprehensive RAG Benchmark”, were accepted to the NeurIPS 2024 Datasets and Benchmarks Track.
Sep 20, 2024 :pencil: [EMNLP 2024] Two papers, “MAR: Matching-Augmented Reasoning for Enhancing Visual-based Entity Question Answering” (main) and “ChartInsights: Evaluating Multimodal Large Language Models for Low-Level Chart Question Answering” (findings), were accepted.
Jul 15, 2024 :pencil: [VLDB 2024] Six papers, (1) “MisDetect: Iterative Mislabel Detection using Early Loss”, (2) “LakeBench: A Benchmark for Discovering Joinable and Unionable Tables in Data Lakes”, (3) “Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL”, (4) “Are Large Language Models a Good Replacement of Taxonomies?”, (5) “HAIChart: Human and AI Paired Visualization System”, (6) “The Dawn of Natural Language to SQL: Are We Fully Ready?”, and two demos, (i) “Retrieval-Based Tabular Data Cleaning Using LLMs and Data Lake”, (ii) “LakeCompass: An End-to-End System for Table Maintenance, Search and Analysis in Data Lakes”, were accepted.
Mar 18, 2024 :pencil: [SIGMOD 2024] Paper “Controllable Tabular Data Synthesis Using Diffusion Models” and two demos, “IDE: A System for Iterative Mislabel Detection” and “ChatPipe: Orchestrating Data Preparation Pipelines by Optimizing Human-ChatGPT Interactions”, were accepted.
Mar 10, 2024 :pencil: [ICDE 2024] Two papers, “Mitigating Data Scarcity in Supervised Machine Learning through Reinforcement Learning Guided Data Generation” and “Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration”, were accepted.
Mar 8, 2024 :trophy: [KDD Cup 2024] Our proposal “CRAG - Comprehensive RAG Benchmark and Challenge”, co-hosted with Meta Reality Labs, was accepted.
Dec 16, 2023 :medal_sports: [2024 SIGMOD Research Highlight Award] Paper “Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration”.
Nov 16, 2023 :trophy: [Best of SIGMOD 2023] Paper “GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data”.
Oct 7, 2023 :pencil: [CIDR 2024] Paper “VerifAI: Verified Generative AI” was accepted.
Oct 1, 2023 Ph.D. positions available for 2024 spring/fall :sparkles: