Hello

Aalborg, Denmark, Spring 2023

About Me

I am a ZJU 100-talent Program Professor and Ph.D. supervisor with the College of Computer Science and Technology at Zhejiang University in China. I am also an NSFC Excellent Young Scientist (Overseas) and an EU Marie Curie Individual Fellow. Prior to my current position, I served as an Assistant Professor in the Department of Computer Science at Aalborg University in Denmark from 2020 to 2022 and as a Senior Engineer at Ant Group (formerly Alipay) from 2018 to 2019. I earned my Ph.D. from Zhejiang University in 2018 and my B.Sc. from Sichuan University in China in 2012.

Research

My current research focuses on data-centric, resource-efficient, and scalable AI, which I believe are essential for advancing the adoption of AI in real-world applications. My earlier work primarily addressed data management and AI techniques for time-series (TS) and spatiotemporal (ST) data. More recently, these interests have broadened to include relational, unstructured, and multimodal data. I'm lucky to work and grow with an amazing group of passionate and talented young people at Zhejiang University. Our research group is called Sustainable Data Intelligence and Data Systems [SuDIS]. Lately, we've been pushing the boundaries in two key areas:
  • Data + AI:
    • data storage [HyperMR, SIGMOD'25], knowledge amalgamation [BoKA, KDD'24], data protection [PIECK, ICDE'24], data selection [CHASe, TKDE'25], and data augmentation and synthesis [CogSQL, AAAI'24] for AI model training and serving;
    • AI-empowered data cleansing [MPIN, PVLDB'24; BiSIM, ICDE'23], data discovery [nlcTables, SIGIR'25], and data ingestion [Hippo, SIGMOD'25].
  • Efficient AI:
    • lightweight AI models in edge computing [LightCTS, SIGMOD'23; E2USD, WWW'24; ReCTSi, KDD'24; LightCTS*, TKDE'24] and federated learning [LightTR, ICDE'24; FedBFPT, IJCAI'23];
    • efficient LLM inference [HMI, VLDBJ'25; Draft&Verify, ACL'24] and fine-tuning [LoRAM, ICLR'25].

Quick Access

  • Codebase "ReCTSi: Resource-efficient Correlated Time Series Imputation via Decoupled Pattern Learning and Completeness-aware Attentions" (KDD 2024).
  • Codebase "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding" (ACL 2024).
  • A systematic review on Tabular Data Augmentation for Machine Learning.
  • Champion Solution to Hybrid Vector Search (SIGMOD 2024 Programming Contest), with the SUSTech DB Group.
  • Codebase "Missing value imputation for multi-attribute sensor data streams via message propagation" (VLDB 2024).
  • Codebase "LightCTS: A lightweight framework for correlated time series forecasting" (SIGMOD 2023).
  • SIGMOD 2022 Tutorial "Spatial data quality in the IoT era: Management and exploitation" (with B. Tang, H. Lu, M.A. Cheema, and C.S. Jensen).
  • Author Version "Spatial data quality in the Internet of Things: Management, exploitation, and prospects" (CSUR 2022).
  • Benchmark "Indoor spatial queries: Modeling, indexing, and processing" (EDBT 2020).
  • EU Marie Skłodowska-Curie Individual Fellowship "MALOT: Managing Mobility Data Quality for Location of Things".