
[Paper Review] Efficient Estimation of Word Representations in Vector Space ② | Word Representation ③ Word2Vec

공부하는 2023. 4. 8. 18:44

Table of Contents

Abstract
1. Introduction
2. Model Architectures
3. New Log-linear Models
　3.1 Continuous Bag-of-Words Model (CBOW)
　3.2 Continuous Skip-gram Model (Skip-gram)
4. Results
5. Examples of the Learned Relationships
6. Conclusion
7. Follow-Up Work

This post is a review of the paper <Efficient Estimation of Word Representations in Vector Space>, better known as Word2Vec. (Honestly, it is mostly a translation and summary..)

Part ② covers Section 3 of the table of contents above: CBOW and Skip-gram.

3. New Log-linear Models

Most of the computation comes from the non-linear hidden layer → let's improve that!
Two models are proposed for learning word representations while minimizing computational complexity → CBOW, Skip-gram
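
To make the "hidden layer dominates" point concrete, here is a rough arithmetic sketch. It assumes the NNLM cost from Section 2 of the paper, Q = N×D + N×D×H + H×V, with the output term reduced to H×log2(V) by hierarchical softmax. The function name `nnlm_cost_terms` and the values N=10, D=500, H=500, V=1,000,000 are illustrative picks, not results from the paper.

```python
import math

def nnlm_cost_terms(N, D, H, V):
    """Per-example cost terms of a feedforward NNLM (hypothetical helper).

    N: number of input words, D: projection size per word,
    H: hidden layer size, V: vocabulary size.
    """
    return {
        "projection": N * D,          # look up N word vectors
        "hidden": N * D * H,          # projection layer -> non-linear hidden layer
        "output": H * math.log2(V),   # hidden -> output, assuming hierarchical softmax
    }

# Illustrative values only.
terms = nnlm_cost_terms(N=10, D=500, H=500, V=1_000_000)
dominant = max(terms, key=terms.get)

for name, cost in terms.items():
    print(f"{name:>10}: {cost:,.0f}")
print("dominant term:", dominant)
```

Under these assumptions the hidden-layer term is orders of magnitude larger than the other two, which is why both new models drop that layer.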

3.1 Continuous Bag-of-Words Model

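Architecture similar to NNLM
But, 1️⃣ the non-linear hidden layer of NNLM is removed
　　2️⃣ the projection layer is shared by all words (word order is ignored here)
⇒ all words are projected to the same position, i.e. their vectors are averaged
Predicts the current word from history and future words
Training Complexity: $Q = N \times D + D \times \log_2(V)$ (N: number of context words, D: word vector dimensionality, V: vocabulary size)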
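
The points above can be sketched as a tiny forward pass. This is a minimal pure-Python sketch, not the paper's implementation: it uses a full softmax instead of the hierarchical softmax behind the $\log_2(V)$ term, and the names `cbow_forward`, `W_in`, `W_out` and the toy sizes are assumptions made for illustration.

```python
import math
import random

def cbow_forward(context_ids, W_in, W_out):
    """Average the context word vectors, then score every vocabulary word."""
    dim = len(W_in[0])
    # Shared projection: every context word uses the same matrix W_in,
    # and the looked-up vectors are averaged, so word order is lost.
    h = [sum(W_in[i][d] for i in context_ids) / len(context_ids)
         for d in range(dim)]
    # Linear output layer; there is no non-linear hidden layer in between.
    scores = [sum(w * x for w, x in zip(row, h)) for row in W_out]
    # Full softmax for brevity (assumption; the paper uses hierarchical softmax).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return h, [e / total for e in exps]

# Toy setup: vocabulary of 6 words, 3-dimensional vectors.
rng = random.Random(0)
V, D = 6, 3
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(V)]
W_out = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(V)]

context_ids = [0, 1, 3, 4]   # two history words and two future words
h, probs = cbow_forward(context_ids, W_in, W_out)
print("averaged projection:", h)
print("probabilities sum to:", sum(probs))
```

Because the context vectors are averaged, shuffling `context_ids` yields the same `h`, which is the "bag-of-words" part of the name.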

3.2 Continuous Skip-gram Model

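Predicts the surrounding words from the current word
Surrounding words are weighted by their distance from the current word: more distant words are sampled less often as training examples
Training Complexity: $Q = C \times (D + D \times \log_2(V))$, C: maximum distance of the words
With the current word as input, the model performs (R + R) word classifications (R history words and R future words as labels), where R is a number drawn at random from the range 1~C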
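
The sampling scheme above can be sketched as a training-pair generator. This is a minimal sketch under stated assumptions: the function name `skipgram_pairs`, the example sentence, and the choice to simply skip positions that fall outside the sentence are illustrative, not taken from the paper.

```python
import random

def skipgram_pairs(tokens, C, rng):
    """Return (current word, surrounding word) training pairs."""
    pairs = []
    for t, center in enumerate(tokens):
        # Draw R uniformly from 1..C. A word at distance d is used only
        # when R >= d, so distant words are sampled less often.
        R = rng.randint(1, C)
        for offset in range(-R, R + 1):
            ctx = t + offset
            # Skip the current word itself and positions outside the sentence.
            if offset != 0 and 0 <= ctx < len(tokens):
                pairs.append((center, tokens[ctx]))
    return pairs

tokens = "the quick brown fox jumps over the lazy dog".split()
pairs = skipgram_pairs(tokens, C=3, rng=random.Random(0))
for center, context in pairs[:5]:
    print(center, "->", context)
```

Away from the sentence boundaries each current word produces exactly 2R pairs, matching the (R + R) classifications noted above; at the edges fewer pairs are produced because some positions do not exist.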