Me when I was in Edinburgh

About

I completed my Ph.D. in Computer Science (June 2020) at the School of Computing & Information Systems (SCIS), Singapore Management University (SMU) (Rank 81 Overall, Rank 4 in Software Engineering Research on CSRankings).

During my Ph.D., I was fortunate to be advised by Prof. Lingxiao Jiang. I also received tremendous guidance from Prof. Yijun Yu from The Open University, UK. I was grateful to receive the Presidential Doctoral Fellowship in Computing and to be named to the SMU Dean’s List for outstanding research achievement. I am the first author of several publications in top-tier academic conferences across different domains in Computer Science, such as software engineering (ICSE, ESEC/FSE, ASE), artificial intelligence (AAAI), natural language processing (EMNLP, ACL), and information retrieval (SIGIR).

I’m also an active open-source contributor, with the majority of my work available on my GitHub. Notable projects include CodeTF (~1500 stars), CodeT5+ (~2400 stars), CodeCapybara, and The Vault.

Throughout my research career, I’ve had the honor of working with brilliant minds and talents from SOAR Group - SMU, FSoft AI Center, Huawei Ireland Research Center, and Salesforce AI Research.

Research Interests

My research in AI for Software Development (AI4Code, AI4Software) aims to create intelligent tools that help developers with real-world software engineering tasks. I focus on developing algorithms to train and fine-tune Large Language Models for code (CodeLLMs). In addition, I investigate the integration of CodeLLMs with multi-agent systems and traditional program analysis methods. The goal is to build coding assistants that integrate seamlessly into the software development lifecycle, improving the overall developer experience. In summary, my research is structured around four pillars:

  • Foundation: Developing large language models tailored for coding (CodeLLMs) to set the groundwork for further enhancements.
  • Optimization: Refining CodeLLMs to address challenges like hallucinations and security issues, enhancing trustworthiness, and establishing benchmarking standards.
  • Application: Applying and refining CodeLLMs to software engineering tasks such as code generation, code search, code summarization, program synthesis, automated bug detection & program repair, code migration, software testing, etc.
  • Integration: Seamlessly integrating these models into the software development life cycle to foster effective collaboration between human developers and AI-driven tools, including IDE extensions and low-code/no-code platforms.

Highlighted Publications