Nan Tang

I am an associate professor in the Data Science and Analytics Thrust, Information Hub, at the Hong Kong University of Science and Technology (Guangzhou). I also hold an affiliated position at the Hong Kong University of Science and Technology, Clear Water Bay campus, in Hong Kong.

Before joining HKUST(GZ), I worked as a senior scientist at Qatar Computing Research Institute, a visiting scientist at MIT CSAIL, a research fellow at the University of Edinburgh, a scientific staff member at CWI (the national research institute for mathematics and computer science in the Netherlands), and a visiting scholar at the University of Waterloo.

I direct the Data Intelligence and Analytics Lab (DIAL), which focuses on finding good data and smart analytics that are fundamental to data management, data science, and artificial intelligence.

  • Retrieval-based language models using multi-modal data lakes. Data lakes have become increasingly popular in many organizations. Given a natural language question, retrieving relevant datasets (e.g., text, tables, graphs) and reasoning over them with language models are key to business intelligence.
  • Good data for AI (a.k.a. data-centric AI). For most machine learning practitioners, the success of machine learning projects heavily depends on whether we can find good data for model training.
  • AI for good data. Data scientists spend at least 80% of their time on data preparation. Machine learning models can help address diverse data preparation challenges.
  • Visualization. Data visualization is important to data analytics. I am working on automatic visualization, visualization recommendation, chat-to-story, chat-to-video, and visualization using AR/VR devices.

Office: E3 601
E-mail: nantang (at) hkust-gz.edu.cn
Call: (+86)-20-88330888

news

May 16, 2025 :pencil: [KDD 2025] Paper “NL2SQL-BUGs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation” was accepted by KDD 2025 (Datasets and Benchmarks Track).
May 1, 2025 :pencil: [ICML 2025] Paper “Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search” was accepted by ICML 2025 (poster).
Apr 30, 2025 :pencil: [IJCAI 2025] Paper “RAMer: Reconstruction-based Adversarial Model for Multi-party Multi-modal Multi-label Emotion Recognition” was accepted by IJCAI 2025.
Apr 16, 2025 :pencil: [VLDB 2025] Paper “Weak-to-Strong Prompts with Lightweight-to-Powerful LLMs for High-Accuracy, Low-Cost, and Explainable Data Transformation” was accepted by VLDB 2025.
Apr 2, 2025 :pencil: [AIED 2025] Paper “Automatic Modeling and Analysis of Students’ Problem-Solving Handwriting Trajectories” was accepted by The 26th International Conference on Artificial Intelligence in Education.