Hello everyone, I am Haiyang Xu, a master's student in the UBIoT Lab, and my supervisor is Professor Gong Wei.
Thanks to the internship opportunity provided by the lab and recommendations from teachers and senior students working at Alibaba, I passed one HR interview and two rounds of technical interviews, and joined the AIDC Dialogue Algorithm Team at Alibaba as an intern in September 2023. My research there follows the topic recommended by my supervisor, which is also a topic of active interest in industry.
My main project during the internship focused on reducing hallucinations. We used the DeepSpeed framework and the PPO algorithm to train a 13B model, following the standard RLHF pipeline, which consists of three steps: supervised fine-tuning (SFT), reward model (RM) training, and PPO training. We also used retrieval-augmented generation (RAG) to help reduce hallucinations in the model's generated text.
I also had the opportunity to use an 8-node, 64-card server cluster and to enjoy Alibaba's intern benefits. On the research side, I learned about distributed programming and gained a deeper understanding of the PPO algorithm and of OpenAI's focus on alignment. On the business side, I learned how to train models on various deep learning cloud platforms and how to collaborate with colleagues on code.
In addition to the above, I believe that internships can give us a deeper understanding of how algorithms are applied in practice. I am very grateful to my supervisor and senior students for giving me this valuable opportunity.
Advice for students who have just joined the lab: I recommend discussing your study plans clearly with your supervisor soon after joining. If you are doing research, prioritize the topics recommended by your supervisor. I switched to research in April/May 2023, and at that time my supervisor recommended that I look at some popular open-source projects and frameworks related to large language models, as well as work related to RAG. This experience happened to match the requirements of the internship position. The topics recommended by supervisors are usually of interest in industry and can be very helpful for your future career.
Finally, I wish everyone a smooth academic journey, good health, and a fulfilling time in the lab!
Fig: Related photos