Overview of Word Embedding using Embeddings from Language Models (ELMo)

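What is word embedding?

Word embedding is the representation of words as vectors. These vectors capture important information about the words, so that words lying in the same neighborhood of the vector space have similar meanings. There are several ways to create word embeddings, such as Word2Vec (with its Continuous Bag of Words (CBOW) and Skip-gram variants), GloVe, and ELMo.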
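The idea that nearby vectors mean similar things can be sketched with cosine similarity. The three-dimensional vectors and the helper name `cosine_similarity` below are made up for illustration; real embeddings have hundreds of dimensions and are learned from data.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (closer to 1.0 = more similar)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy embeddings, invented for this example only.
toy_vectors = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.75, 0.20],
    "apple": [0.10, 0.20, 0.90],
}

sim_king_queen = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_king_apple = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

# Related words should score higher than unrelated ones.
print("king vs queen:", sim_king_queen)
print("king vs apple:", sim_king_apple)
```

With these toy values, "king" and "queen" point in nearly the same direction, while "king" and "apple" do not.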

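Embeddings from Language Models (ELMo):

ELMo is a way of representing words as contextual vectors, developed by the AllenNLP team. ELMo word vectors are computed with a two-layer bidirectional language model (biLM). Each layer has a forward pass and a backward pass.
Unlike GloVe and Word2Vec, ELMo represents a word using the complete sentence that contains it. ELMo embeddings can therefore capture the context in which a word is used, and can produce different embeddings for the same word when it appears in different contexts in different sentences.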
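To make the contrast concrete, here is a minimal sketch of how a static embedding lookup behaves. The table `static_table`, its values, and the function `static_embed` are hypothetical and stand in for a GloVe- or Word2Vec-style model; they are not taken from any library.

```python
# Hypothetical static embedding table: one fixed vector per word,
# no matter which sentence the word appears in.
static_table = {
    "watch": [0.2, -0.1, 0.7],
    "tv": [0.5, 0.3, -0.2],
    "wrist": [-0.4, 0.6, 0.1],
}

UNKNOWN = [0.0, 0.0, 0.0]  # placeholder vector for words not in the table


def static_embed(sentence):
    """Look up every whitespace-separated token independently of its context."""
    return [static_table.get(token.lower(), UNKNOWN) for token in sentence.split()]


s1 = static_embed("I love to watch TV")
s2 = static_embed("I am wearing a wrist watch")

# "watch" is token 3 in the first sentence and token 5 in the second.
# A static table returns the identical vector for both occurrences.
print(s1[3] == s2[5])
```

A contextual model such as ELMo differs at exactly this point: the vector for each token is computed from the whole sentence, so the two occurrences of "watch" generally get different vectors.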

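For example:

I love to watch TV.
I am wearing a wrist watch.

In the first sentence, "watch" is used as a verb, while in the second sentence it is used as a noun. Words like this, whose meaning depends on the context in which they are used, are called polysemous words. ELMo handles this property of words successfully, whereas GloVe and FastText cannot capture it.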

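Implementing word embeddings using ELMo:

The following code was tested on Google Colab. Before running the code, run these commands in a terminal to install the necessary libraries.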

pip install "tensorflow>=2.0.0"
pip install --upgrade tensorflow-hub

Code:

Python3

# Import the necessary libraries
import tensorflow_hub as hub
import tensorflow.compat.v1 as tf

# hub.Module is a TF1-style API, so run in graph mode
tf.disable_eager_execution()

# Load the pre-trained ELMo model from TensorFlow Hub
elmo = hub.Module("https://tfhub.dev/google/elmo/3", trainable=True)

# Build the op that computes one embedding per token in each sentence
embeddings = elmo(
    [
        "I love to watch TV",
        "I am wearing a wrist watch"
    ],
    signature="default",
    as_dict=True)["elmo"]

# Initialize variables and lookup tables
# (tf.initialize_all_variables is deprecated)
init = tf.group(tf.global_variables_initializer(), tf.tables_initializer())
sess = tf.Session()
sess.run(init)

# Print the embeddings of the word "watch" in the two sentences
print('Word embeddings for word WATCH in first sentence')
print(sess.run(embeddings[0][3]))
print('Word embeddings for word WATCH in second sentence')
print(sess.run(embeddings[1][5]))

Output:

Word embeddings for word WATCH in first sentence
[ 0.14079645 -0.15788531 -0.00950466 ...  0.4300597  -0.52887094
  0.06327899]
Word embeddings for word WATCH in second sentence
[-0.08213335  0.01050366 -0.01454147 ...  0.48705393 -0.54457957
  0.5262399 ]

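Explanation: The output shows different word embeddings for the same word WATCH when it is used in different contexts in different sentences.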
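The indices 3 and 5 used in the code are the positions of "watch" among the space-separated tokens of each sentence. Assuming whitespace tokenization, those positions can be found with a small helper instead of counting by hand. The function name `token_index` is hypothetical and used only for this sketch.

```python
def token_index(sentence, word):
    """Return the 0-based position of `word` among the space-separated tokens."""
    tokens = sentence.split(" ")
    return tokens.index(word)  # raises ValueError if the word is absent


sentences = [
    "I love to watch TV",
    "I am wearing a wrist watch",
]

# Positions of "watch" in each sentence, usable as embeddings[i][position].
positions = [token_index(s, "watch") for s in sentences]
print(positions)
```

Looking up the position this way avoids off-by-one mistakes when the example sentences are changed.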