CS MSc Thesis Presentation 19 February 2025
Lecture
From: 2025-02-19 13:15 to 14:00
Place: E:4130 (LUCAS)
Contact: birger [dot] swahn [at] cs [dot] lth [dot] se
One Computer Science MSc thesis to be presented on 19 February
On Wednesday, 19 February, there will be a master's thesis presentation in Computer Science at Lund University, Faculty of Engineering.
The presentation will take place in E:4130 (LUCAS).
Note to potential opponents: Register as an opponent to the presentation of your choice by sending an email to the examiner for that presentation (firstname.lastname@cs.lth.se). Do not forget to specify the presentation you register for! Note that the number of opponents may be limited (often to two), so you might be forced to choose another presentation if you register too late. Registrations are individual, just as the oppositions are! More instructions are found on this page.
13:15-14:00 in E:4130 (LUCAS)
Presenter: Victor Tiet
Title: Retrieval-Augmented Generation for Technical Question Answering
Examiner: Jacek Malec
Supervisors: Marcus Klang (LTH), Olof Bengtsson (Softhouse Consulting AB)
Today, many companies have access to technical documentation in various formats. Simultaneously, transformer-based language models have enabled new frameworks and applications. One such application is retrieval-augmented generation (RAG). This thesis explores different RAG configurations for Question Answering in the technical domain and identifies the optimal setup. Additionally, we examine how domain differences can affect performance. To achieve this, experiments are conducted on pre-annotated datasets, TechQA and SQuAD, using retrievers like Jina-Embeddings-v3 and readers like Llama-3. Performance is evaluated through metrics such as Recall@k, F1, and BERTScore. Results indicate that Jina-Embeddings-v3 excels on TechQA (Recall@5=0.625), while BM25 shows strong performance on SQuAD (Recall@5=0.833). Furthermore, Phi-3 Mini-128K-Instruct emerges as the optimal reader for TechQA, achieving a BERTScore of 0.865. While some differences were observed, further research is required to fully understand the impact of domain specificity on RAG performance.
Link to popular science summary: To be uploaded