From 68cd62fde66c6b1438f7e88540b3941ccb21ea16 Mon Sep 17 00:00:00 2001
From: zhou201505013 <39976863+zhou201505013@users.noreply.github.com>
Date: Fri, 25 Mar 2022 15:03:43 +0800
Subject: [PATCH 1/6] change 15.2 title in chinese version (#1109)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

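Change the title of section 15.2 in the Chinese version from “15.2. 情感分析：使用递归神经网络” (recursive neural network) to “15.2. 情感分析：使用循环神经网络” (recurrent neural network).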
---
 .../sentiment-analysis-rnn.md                                   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chapter_natural-language-processing-applications/sentiment-analysis-rnn.md b/chapter_natural-language-processing-applications/sentiment-analysis-rnn.md
index 43f05b3a0..c6f872575 100644
--- a/chapter_natural-language-processing-applications/sentiment-analysis-rnn.md
+++ b/chapter_natural-language-processing-applications/sentiment-analysis-rnn.md
@@ -1,4 +1,4 @@
-# 情感分析：使用递归神经网络
+# 情感分析：使用循环神经网络
 :label:`sec_sentiment_rnn`
 
 与词相似度和类比任务一样，我们也可以将预先训练的词向量应用于情感分析。由于 :numref:`sec_sentiment`中的IMDb评论数据集不是很大，使用在大规模语料库上预训练的文本表示可以减少模型的过拟合。作为 :numref:`fig_nlp-map-sa-rnn`中所示的具体示例，我们将使用预训练的GloVe模型来表示每个词元，并将这些词元表示送入多层双向循环神经网络以获得文本序列表示，该文本序列表示将被转换为情感分析输出 :cite:`Maas.Daly.Pham.ea.2011`。对于相同的下游应用，我们稍后将考虑不同的架构选择。

From 8d6e06682466a45f52f0ef53ea582a1c4b4fb726 Mon Sep 17 00:00:00 2001
From: Xinwei Liu <xinzone@outlook.com>
Date: Fri, 25 Mar 2022 15:08:00 +0800
Subject: [PATCH 2/6] Revise the wording of some passages (#1105)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 chapter_preface/index.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chapter_preface/index.md b/chapter_preface/index.md
index b05020480..519f30ea8 100644
--- a/chapter_preface/index.md
+++ b/chapter_preface/index.md
@@ -24,9 +24,9 @@
 
 许多教科书教授一系列的主题，每一个都非常详细。例如，Chris Bishop的优秀教科书 :cite:`Bishop.2006` ，对每个主题都教得很透彻，以至于要读到线性回归这一章需要大量的工作。虽然专家们喜欢这本书正是因为它的透彻性，但对于初学者来说，这一特性限制了它作为介绍性文本的实用性。
 
-在这本书中，我们将适时教授大部分概念。换句话说，你将在实现某些实际目的所需的非常时刻学习概念。虽然我们在开始时花了一些时间来教授基础基础知识，如线性代数和概率，但我们希望你在担心更深奥的概率分布之前，先体会一下训练第一个模型的满足感。
+在这本书中，我们将适时教授大部分概念。换句话说，你将在实现某些实际目的所需的非常时刻学习概念。虽然我们在开始时花了一些时间来教授基础的背景知识，如线性代数和概率，但我们希望你在思考更深奥的概率分布之前，先体会一下训练模型的满足感。
 
-除了提供基本数学背景速成课程的几节初步课程外，后续的每一章都介绍了合理数量的新概念，并提供一个独立工作的例子——使用真实的数据集。这带来了组织上的挑战。某些模型可能在逻辑上组合在单节中。而一些想法可能最好是通过连续允许几个模型来传授。另一方面，坚持“一个工作例子一节”的策略有一个很大的好处：这使你可以通过利用我们的代码尽可能轻松地启动你自己的研究项目。只需复制这一节的内容并开始修改即可。
+除了提供基本数学背景速成课程的几节初步课程外，后续的每一章都介绍了适量的新概念，并提供可独立工作的例子——使用真实的数据集。这带来了组织上的挑战。某些模型可能在逻辑上组合在单节中。而一些想法可能最好是通过连续运行几个模型来传授。另一方面，坚持“一个工作例子一节”的策略有一个很大的好处：这使你可以通过利用我们的代码尽可能轻松地启动你自己的研究项目。只需复制这一节的内容并开始修改即可。
 
 我们将根据需要将可运行代码与背景材料交错。通常，在充分解释工具之前，我们常常会在提供工具这一方面犯错误（我们将在稍后解释背景）。例如，在充分解释*随机梯度下降*为什么有用或为什么有效之前，我们可以使用它。这有助于给从业者提供快速解决问题所需的弹药，同时需要读者相信我们的一些决定。
 

From f204e8520934daa43191ca1e26da5fcc96796226 Mon Sep 17 00:00:00 2001
From: Anirudh Dagar <anirudhdagar6@gmail.com>
Date: Fri, 1 Apr 2022 03:23:47 +0530
Subject: [PATCH 3/6] Update r0.17.5 (#1120)

---
 chapter_linear-networks/linear-regression.md  | 15 +++++++++++++--
 .../kaggle-house-price.md                     |  3 +--
 chapter_optimization/optimization-intro.md    | 19 ++++++++++++++++++-
 chapter_preface/index.md                      |  1 +
 chapter_preliminaries/calculus.md             |  8 ++++----
 d2l/mxnet.py                                  |  3 ++-
 d2l/tensorflow.py                             |  3 ++-
 d2l/torch.py                                  |  3 ++-
 setup.py                                      |  4 ++--
 static/build.yml                              |  6 +++---
 10 files changed, 48 insertions(+), 17 deletions(-)
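
Note: this patch moves every `use_svg_display` call from `display.set_matplotlib_formats` to `backend_inline.set_matplotlib_formats`. For readers whose environment may lack the `matplotlib_inline` package, the following is a minimal sketch of a lookup that prefers the new location and falls back to the older IPython one. The helper name `find_formats_setter` is hypothetical and is not part of `d2l`; the patch itself imports `backend_inline` unconditionally.

```python
import importlib


def find_formats_setter():
    """Return (module_name, function) for the first importable setter, or None.

    Hypothetical helper for illustration only; it is not defined in d2l.
    """
    # Location used after this patch first, then the older IPython location.
    candidates = [
        ("matplotlib_inline.backend_inline", "set_matplotlib_formats"),
        ("IPython.display", "set_matplotlib_formats"),
    ]
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Package missing in this environment; try the next candidate.
            continue
        setter = getattr(module, attr, None)
        if callable(setter):
            return module_name, setter
    return None
```

Inside a Jupyter session, the returned function can then be called as `setter('svg')`, which matches the call made in `use_svg_display`. Outside a notebook, the call may have no effect or may fail, so the sketch only locates the function and does not invoke it.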

diff --git a/chapter_linear-networks/linear-regression.md b/chapter_linear-networks/linear-regression.md
index b950399fd..fe2fa22f6 100644
--- a/chapter_linear-networks/linear-regression.md
+++ b/chapter_linear-networks/linear-regression.md
@@ -223,7 +223,7 @@ $\eta$表示*学习率*（learning rate）。
 %matplotlib inline
 from d2l import mxnet as d2l
 import math
-import numpy as np
+from mxnet import np
 import time
 ```
 
@@ -350,7 +350,18 @@ def normal(x, mu, sigma):
 我们现在(**可视化正态分布**)。
 
 ```{.python .input}
-#@tab all
+#@tab mxnet
+# 再次使用numpy进行可视化
+x = np.arange(-7, 7, 0.01)
+# Mean and standard deviation pairs
+params = [(0, 1), (0, 2), (3, 1)]
+d2l.plot(x.asnumpy(), [normal(x, mu, sigma).asnumpy() for mu, sigma in params], xlabel='x',
+         ylabel='p(x)', figsize=(4.5, 2.5),
+         legend=[f'mean {mu}, std {sigma}' for mu, sigma in params])
+```
+
+```{.python .input}
+#@tab pytorch, tensorflow
 # 再次使用numpy进行可视化
 x = np.arange(-7, 7, 0.01)
 
diff --git a/chapter_multilayer-perceptrons/kaggle-house-price.md b/chapter_multilayer-perceptrons/kaggle-house-price.md
index 7e846c464..cb60524eb 100644
--- a/chapter_multilayer-perceptrons/kaggle-house-price.md
+++ b/chapter_multilayer-perceptrons/kaggle-house-price.md
@@ -439,7 +439,6 @@ def train(net, train_features, train_labels, test_features, test_labels,
 具体地说，它选择第$i$个切片作为验证数据，其余部分作为训练数据。
 注意，这并不是处理数据的最有效方法，如果我们的数据集大得多，会有其他解决办法。
 
-
 ```{.python .input}
 #@tab all
 def get_k_fold_data(k, i, X, y):
@@ -514,7 +513,7 @@ print(f'{k}-折验证: 平均训练log rmse: {float(train_l):f}, '
 
 ```{.python .input}
 #@tab all
-def train_and_pred(train_features, test_feature, train_labels, test_data,
+def train_and_pred(train_features, test_features, train_labels, test_data,
                    num_epochs, lr, weight_decay, batch_size):
     net = get_net()
     train_ls, _ = train(net, train_features, train_labels, None, None,
diff --git a/chapter_optimization/optimization-intro.md b/chapter_optimization/optimization-intro.md
index bbe3796d7..f6e2cf7ab 100644
--- a/chapter_optimization/optimization-intro.md
+++ b/chapter_optimization/optimization-intro.md
@@ -98,7 +98,24 @@ annotate('saddle point', (0, -0.2), (-0.52, -5.0))
 如下例所示，较高维度的鞍点甚至更加隐蔽。考虑这个函数$f(x, y) = x^2 - y^2$。它的鞍点为$(0, 0)$。这是关于$y$的最大值，也是关于$x$的最小值。此外，它*看起来*像马鞍，这就是这个数学属性的名字由来。
 
 ```{.python .input}
-#@tab all
+#@tab mxnet
+x, y = d2l.meshgrid(
+    d2l.linspace(-1.0, 1.0, 101), d2l.linspace(-1.0, 1.0, 101))
+z = x**2 - y**2
+ax = d2l.plt.figure().add_subplot(111, projection='3d')
+ax.plot_wireframe(x.asnumpy(), y.asnumpy(), z.asnumpy(),
+                  **{'rstride': 10, 'cstride': 10})
+ax.plot([0], [0], [0], 'rx')
+ticks = [-1, 0, 1]
+d2l.plt.xticks(ticks)
+d2l.plt.yticks(ticks)
+ax.set_zticks(ticks)
+d2l.plt.xlabel('x')
+d2l.plt.ylabel('y');
+```
+
+```{.python .input}
+#@tab pytorch, tensorflow
 x, y = d2l.meshgrid(
     d2l.linspace(-1.0, 1.0, 101), d2l.linspace(-1.0, 1.0, 101))
 z = x**2 - y**2
diff --git a/chapter_preface/index.md b/chapter_preface/index.md
index 519f30ea8..8667df566 100644
--- a/chapter_preface/index.md
+++ b/chapter_preface/index.md
@@ -63,6 +63,7 @@ from collections import defaultdict
 from IPython import display
 import math
 from matplotlib import pyplot as plt
+from matplotlib_inline import backend_inline
 import os
 import pandas as pd
 import random
diff --git a/chapter_preliminaries/calculus.md b/chapter_preliminaries/calculus.md
index 006d14cc4..08baf7030 100644
--- a/chapter_preliminaries/calculus.md
+++ b/chapter_preliminaries/calculus.md
@@ -53,7 +53,7 @@
 ```{.python .input}
 %matplotlib inline
 from d2l import mxnet as d2l
-from IPython import display
+from matplotlib_inline import backend_inline
 from mxnet import np, npx
 npx.set_np()
 
@@ -65,7 +65,7 @@ def f(x):
 #@tab pytorch
 %matplotlib inline
 from d2l import torch as d2l
-from IPython import display
+from matplotlib_inline import backend_inline
 import numpy as np
 
 def f(x):
@@ -76,7 +76,7 @@ def f(x):
 #@tab tensorflow
 %matplotlib inline
 from d2l import tensorflow as d2l
-from IPython import display
+from matplotlib_inline import backend_inline
 import numpy as np
 
 def f(x):
@@ -145,7 +145,7 @@ $$\frac{d}{dx} \left[\frac{f(x)}{g(x)}\right] = \frac{g(x) \frac{d}{dx} [f(x)] -
 #@tab all
 def use_svg_display():  #@save
     """使用svg格式在Jupyter中显示绘图"""
-    display.set_matplotlib_formats('svg')
+    backend_inline.set_matplotlib_formats('svg')
 ```
 
 我们定义`set_figsize`函数来设置图表大小。
diff --git a/d2l/mxnet.py b/d2l/mxnet.py
index 70eddb70e..8ed5a5f34 100644
--- a/d2l/mxnet.py
+++ b/d2l/mxnet.py
@@ -19,6 +19,7 @@
 import requests
 from IPython import display
 from matplotlib import pyplot as plt
+from matplotlib_inline import backend_inline
 
 d2l = sys.modules[__name__]
 
@@ -29,7 +30,7 @@ def use_svg_display():
     """使用svg格式在Jupyter中显示绘图
 
     Defined in :numref:`sec_calculus`"""
-    display.set_matplotlib_formats('svg')
+    backend_inline.set_matplotlib_formats('svg')
 
 def set_figsize(figsize=(3.5, 2.5)):
     """设置matplotlib的图表大小
diff --git a/d2l/tensorflow.py b/d2l/tensorflow.py
index 4bd214679..9735d43c0 100644
--- a/d2l/tensorflow.py
+++ b/d2l/tensorflow.py
@@ -19,6 +19,7 @@
 import requests
 from IPython import display
 from matplotlib import pyplot as plt
+from matplotlib_inline import backend_inline
 
 d2l = sys.modules[__name__]
 
@@ -29,7 +30,7 @@ def use_svg_display():
     """使用svg格式在Jupyter中显示绘图
 
     Defined in :numref:`sec_calculus`"""
-    display.set_matplotlib_formats('svg')
+    backend_inline.set_matplotlib_formats('svg')
 
 def set_figsize(figsize=(3.5, 2.5)):
     """设置matplotlib的图表大小
diff --git a/d2l/torch.py b/d2l/torch.py
index 1faef7d32..c092d774d 100644
--- a/d2l/torch.py
+++ b/d2l/torch.py
@@ -19,6 +19,7 @@
 import requests
 from IPython import display
 from matplotlib import pyplot as plt
+from matplotlib_inline import backend_inline
 
 d2l = sys.modules[__name__]
 
@@ -35,7 +36,7 @@ def use_svg_display():
     """使用svg格式在Jupyter中显示绘图
 
     Defined in :numref:`sec_calculus`"""
-    display.set_matplotlib_formats('svg')
+    backend_inline.set_matplotlib_formats('svg')
 
 def set_figsize(figsize=(3.5, 2.5)):
     """设置matplotlib的图表大小
diff --git a/setup.py b/setup.py
index c50e32c8d..1a38f1c75 100644
--- a/setup.py
+++ b/setup.py
@@ -3,8 +3,8 @@
 
 requirements = [
     'jupyter==1.0.0',
-    'numpy==1.22.2',
-    'matplotlib==3.4',
+    'numpy==1.21.5',
+    'matplotlib==3.5.1',
     'requests==2.25.1',
     'pandas==1.2.4'
 ]
diff --git a/static/build.yml b/static/build.yml
index aceba9a76..314d3328e 100644
--- a/static/build.yml
+++ b/static/build.yml
@@ -2,12 +2,12 @@ dependencies:
   - python=3.8
   - pip
   - pip:
-    - d2l==0.17.4
+    - d2l==0.17.5
     - git+https://github.com/d2l-ai/d2l-book
     - mxnet-cu102==1.7.0
-    - torch==1.10.2+cu102
+    - torch==1.11.0+cu102
     - -f https://download.pytorch.org/whl/torch_stable.html
-    - torchvision==0.11.3+cu102
+    - torchvision==0.12.0+cu102
     - -f https://download.pytorch.org/whl/torch_stable.html
     - tensorflow==2.8.0
     - tensorflow-probability==0.16.0

From fef38f18175371591c79a7697faf588eacc94a31 Mon Sep 17 00:00:00 2001
From: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Date: Thu, 31 Mar 2022 14:56:20 -0700
Subject: [PATCH 4/6] Bump versions in installation

---
 chapter_installation/index.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
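
Note: after this bump, the installation page pins `torch==1.11.0`, `torchvision==0.12.0`, and `d2l==0.17.5`. As a rough sketch, a reader could compare an existing environment against these pins with the standard-library `importlib.metadata` module. The names `PINS` and `find_pin_mismatches` are hypothetical, and dropping a local build tag such as `+cu102` before comparing is an assumption made for this example.

```python
from importlib import metadata

# Versions taken from the pip commands changed in this patch.
PINS = {"torch": "1.11.0", "torchvision": "0.12.0", "d2l": "0.17.5"}


def find_pin_mismatches(pins):
    """Map each package whose installed version differs from its pin
    to the installed version string (None if it is not installed)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Assumption: ignore a local build tag such as "+cu102".
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches
```

Calling `find_pin_mismatches(PINS)` returns an empty dictionary when all three packages match their pins.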

diff --git a/chapter_installation/index.md b/chapter_installation/index.md
index 5e95b58ef..739586ed5 100644
--- a/chapter_installation/index.md
+++ b/chapter_installation/index.md
@@ -88,8 +88,8 @@ pip install mxnet==1.7.0.post1
 你可以按如下方式安装PyTorch的CPU或GPU版本：
 
 ```bash
-pip install torch==1.10.2
-pip install torchvision==0.11.3
+pip install torch==1.11.0
+pip install torchvision==0.12.0
 ```
 
 
@@ -109,7 +109,7 @@ pip install tensorflow-probability==0.16.0
 我们的下一步是安装`d2l`包，以方便调取本书中经常使用的函数和类：
 
 ```bash
-pip install d2l==0.17.4
+pip install d2l==0.17.5
 ```
 
 

From 73f2f25a4ac6c07f6a02526766346b4553365d75 Mon Sep 17 00:00:00 2001
From: hugo_han <57249629+HugoHann@users.noreply.github.com>
Date: Wed, 20 Apr 2022 19:20:58 +0800
Subject: [PATCH 5/6] Line 94 typo: "bert.mall" -> "bert.small" (#1129)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../natural-language-inference-bert.md                          | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chapter_natural-language-processing-applications/natural-language-inference-bert.md b/chapter_natural-language-processing-applications/natural-language-inference-bert.md
index c6cc219bc..086d2832b 100644
--- a/chapter_natural-language-processing-applications/natural-language-inference-bert.md
+++ b/chapter_natural-language-processing-applications/natural-language-inference-bert.md
@@ -91,7 +91,7 @@ def load_pretrained_model(pretrained_model, num_hiddens, ffn_num_hiddens,
     return bert, vocab
 ```
 
-为了便于在大多数机器上演示，我们将在本节中加载和微调经过预训练BERT的小版本（“bert.mall”）。在练习中，我们将展示如何微调大得多的“bert.base”以显著提高测试精度。
+为了便于在大多数机器上演示，我们将在本节中加载和微调经过预训练BERT的小版本（“bert.small”）。在练习中，我们将展示如何微调大得多的“bert.base”以显著提高测试精度。
 
 ```{.python .input}
 #@tab all

From 842bcdb822c4288474411dcf8cd5a6a48dd7132a Mon Sep 17 00:00:00 2001
From: hugo_han <57249629+HugoHann@users.noreply.github.com>
Date: Wed, 20 Apr 2022 19:21:46 +0800
Subject: [PATCH 6/6] line 313: "bert.mall" -> "bert.small" (#1130)

---
 .../natural-language-inference-bert.md                          | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chapter_natural-language-processing-applications/natural-language-inference-bert.md b/chapter_natural-language-processing-applications/natural-language-inference-bert.md
index 086d2832b..1108a7c70 100644
--- a/chapter_natural-language-processing-applications/natural-language-inference-bert.md
+++ b/chapter_natural-language-processing-applications/natural-language-inference-bert.md
@@ -310,7 +310,7 @@ d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs,
 
 ## 练习
 
-1. 如果您的计算资源允许，请微调一个更大的预训练BERT模型，该模型与原始的BERT基础模型一样大。修改`load_pretrained_model`函数中的参数设置：将“bert.mall”替换为“bert.base”，将`num_hiddens=256`、`ffn_num_hiddens=512`、`num_heads=4`和`num_layers=2`的值分别增加到768、3072、12和12。通过增加微调迭代轮数（可能还会调优其他超参数），你可以获得高于0.86的测试精度吗？
+1. 如果您的计算资源允许，请微调一个更大的预训练BERT模型，该模型与原始的BERT基础模型一样大。修改`load_pretrained_model`函数中的参数设置：将“bert.small”替换为“bert.base”，将`num_hiddens=256`、`ffn_num_hiddens=512`、`num_heads=4`和`num_layers=2`的值分别增加到768、3072、12和12。通过增加微调迭代轮数（可能还会调优其他超参数），你可以获得高于0.86的测试精度吗？
 1. 如何根据一对序列的长度比值截断它们？将此对截断方法与`SNLIBERTDataset`类中使用的方法进行比较。它们的利弊是什么？
 
 :begin_tab:`mxnet`