[Paper News] DeepMind Sparrow

Title: Improving alignment of dialogue agents via targeted human judgements

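This is a language-model paper that DeepMind released on 2022-09-20. It introduces methods for reducing the ways an information-seeking dialogue agent can misbehave: giving unhelpful information, giving inaccurate answers, or producing harmful responses.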

Most of the paper is about the rules that the agent must follow and about evidence.

Reinforcement learning (RL) is part of the improvement technique. The reward function is built from three components: Preference, Rules, and Length and formatting penalties.
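To make that reward composition concrete, here is a minimal Python sketch of how the three components could be combined into a single scalar. It is not the paper's implementation. The names (RewardWeights, combined_reward), the weights, the word limit, and the formatting check are all hypothetical, and the preference and rule scores are assumed to come from separately trained models that are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RewardWeights:
    # All values below are illustrative assumptions, not taken from the paper.
    preference: float = 1.0
    rules: float = 1.0
    max_words: int = 60
    length_penalty_per_word: float = 0.01
    formatting_penalty: float = 0.5


def length_penalty(response: str, w: RewardWeights) -> float:
    """Penalize each word beyond the (hypothetical) word limit."""
    extra_words = max(0, len(response.split()) - w.max_words)
    return extra_words * w.length_penalty_per_word


def formatting_penalty(response: str, w: RewardWeights) -> float:
    """Hypothetical check: empty text or no terminal punctuation is penalized."""
    text = response.strip()
    badly_formatted = (not text) or (text[-1] not in ".!?")
    return w.formatting_penalty if badly_formatted else 0.0


def combined_reward(
    response: str,
    preference_score: float,
    rule_violation_prob: float,
    weights: Optional[RewardWeights] = None,
) -> float:
    """Combine Preference, Rules, and Length/formatting penalties.

    preference_score: assumed output of a preference model (higher is better).
    rule_violation_prob: assumed output of a rule model, in [0, 1].
    """
    w = weights or RewardWeights()
    return (
        w.preference * preference_score
        + w.rules * (1.0 - rule_violation_prob)
        - length_penalty(response, w)
        - formatting_penalty(response, w)
    )


if __name__ == "__main__":
    print(combined_reward("Paris is the capital of France.", 0.8, 0.1))
```

A simple additive form is used here only because it is easy to read. How the actual components are scaled and combined should be checked against the paper itself.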

The appendix covers results from running the agent, further details, the list of rules, potentially harmful examples, and more.

GLBVISION