Analysis Notes (9): Multivariable Differentiation

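This section goes straight to multivariable differentiation; the algebra it relies on is used without definition or proof.\(\require{mathtools}\)

Before starting, we fix some conventions: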

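  • Let \(V\) and \(W\) be linear spaces; \(L(V,W)\) denotes the set of all linear maps from \(V\) to \(W\);
  • For a linear space \(V\), \({\rm dim}V\) denotes its dimension; for \(A\in L(V,W)\), \({\rm Ker}A\) and \({\rm Im}A\) denote the kernel and the image of \(A\);
  • Unless stated otherwise, we use the standard orthonormal basis \(\{\boldsymbol{e}_1,\dots,\boldsymbol{e}_n\}\) of \(\mathbb{R}^n\) (the \(i\)-th component of \(\boldsymbol e_i\) is \(1\) and all others are \(0\)) and the standard orthonormal basis \(\{\boldsymbol{u}_1,\dots,\boldsymbol{u}_m\}\) of \(\mathbb{R}^m\), and \(\boldsymbol f\) denotes a function mapping (a subset of) \(\mathbb{R}^n\) into \(\mathbb{R}^m\);
  • Every \(\boldsymbol y\in\mathbb{R}^m\) can be written uniquely as \(\boldsymbol y=\sum\limits_{k=1}^my_k\boldsymbol u_k\), denoted \(\boldsymbol y=(y_1,\dots,y_m)\); the \(i\)-th component of \(\boldsymbol f\) is the real-valued function \(f_i(\boldsymbol x)\coloneqq \boldsymbol f(\boldsymbol x)\cdot\boldsymbol{u}_i,i=1,\dots,m\);
  • In the third note of this series we took the \(l_2\) norm as the default vector norm; below, the norm of a linear map or of a matrix is by default the induced norm (written \(\Vert\cdot\Vert\)), and, naturally, \(d(A,B)=\lVert A-B\rVert\) serves as the metric on \(L(\mathbb{R}^n,\mathbb{R}^m)\);
  • For \(\boldsymbol x=(x_1,\dots,x_n)\in\mathbb{R}^n\) and \(\boldsymbol y=(y_1,\dots,y_m)\in\mathbb{R}^m\), we write \((\boldsymbol x,\boldsymbol y)\coloneqq(x_1,\dots,x_n,y_1,\dots,y_m)\in\mathbb{R}^{n+m}\).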

I. Derivatives

\(\S1.1\) Differentiation

Definition 1.1.1 Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\). For \(\boldsymbol{x}\in E\), if there exists \(A\in L(\mathbb{R}^n,\mathbb{R}^m)\) such that \(\lim\limits_{\boldsymbol{h}\to\boldsymbol0}\frac{\boldsymbol f(\boldsymbol{x+h})-\boldsymbol f(\boldsymbol{x})-A\boldsymbol h}{|\boldsymbol{h}|}=\boldsymbol0\), then \(\boldsymbol f\) is said to be differentiable at \(\boldsymbol{x}\), and we write \(\boldsymbol f'(\boldsymbol{x})\coloneqq A\) for its derivative. If \(\boldsymbol f\) is differentiable at every \(\boldsymbol{x}\in E\), then \(\boldsymbol f\) is said to be differentiable on \(E\).

Remark If \(\boldsymbol f\) is differentiable at a point, then it is continuous at that point. The condition in the definition is equivalent to \(\lim\limits_{\boldsymbol{h}\to\boldsymbol0}\frac{|\boldsymbol f(\boldsymbol{x+h})-\boldsymbol f(\boldsymbol{x})-A\boldsymbol h|}{|\boldsymbol{h}|}=0\). Moreover, for a real function \(f\), \(f'(x)\) is a linear map from \(\mathbb{R}\) to \(\mathbb{R}\), which in the standard basis can simply be identified with a real number.

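Before discussing differentiability further, we first show that when \(\boldsymbol f\) is differentiable at a point, its derivative \(\boldsymbol f'\) there is unique.

Theorem 1.1.1 With the notation of the definition above, if \(A_1\) and \(A_2\) both satisfy the condition, then \(A_1=A_2\).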

  • Proof Let \(B=A_1-A_2\). Since \(B\boldsymbol h=[\boldsymbol f(\boldsymbol{x+h})-\boldsymbol f(\boldsymbol{x})-A_2\boldsymbol{h}]-[\boldsymbol f(\boldsymbol{x+h})-\boldsymbol f(\boldsymbol{x})-A_1\boldsymbol{h}]\), the definition gives \[ \frac{|B\boldsymbol{h}|}{|\boldsymbol{h}|}\leq\frac{|\boldsymbol f(\boldsymbol{x+h})-\boldsymbol f(\boldsymbol{x})-A_1\boldsymbol{h}|+|\boldsymbol f(\boldsymbol{x+h})-\boldsymbol f(\boldsymbol{x})-A_2\boldsymbol{h}|}{|\boldsymbol{h}|}\to0\quad(\boldsymbol{h}\to\boldsymbol{0}), \] that is, \(\lim\limits_{\boldsymbol{h}\to\boldsymbol{0}}\frac{|B\boldsymbol{h}|}{|\boldsymbol{h}|}=0\).
    For every \(\boldsymbol{x}\in\mathbb{R}^n\) with \(\boldsymbol x\neq\boldsymbol 0\) and every \(t>0\), linearity gives \(\frac{|B\boldsymbol{x}|}{|\boldsymbol{x}|}=\frac{|B(t\boldsymbol{x})|}{|t\boldsymbol{x}|}\). By the above, \[ \frac{|B\boldsymbol{x}|}{|\boldsymbol{x}|}=\lim\limits_{t\to 0^+}\frac{|B(t\boldsymbol{x})|}{|t\boldsymbol{x}|}=0, \] so \(B\boldsymbol x=\boldsymbol 0\) for every \(\boldsymbol x\), hence \(B=0\) and \(A_1=A_2\). \(\Box\)

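Remark For a map \(A\in L(\mathbb{R}^n,\mathbb{R}^m)\), it is easy to see that \(A'(\boldsymbol x)=A\) for every \(\boldsymbol x\in\mathbb{R}^n\).

Next, we introduce the chain rule.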

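Theorem 1.1.2 (Chain rule) Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\); let \(F\subset\mathbb{R}^m\) be an open set with \(\boldsymbol f(E)\subset F\), and \(\boldsymbol g\colon F\to\mathbb{R}^k\). If \(\boldsymbol f\) is differentiable at \(\boldsymbol x_0\in E\) and \(\boldsymbol g\) is differentiable at \(\boldsymbol f(\boldsymbol x_0)\), then \(\boldsymbol g\circ\boldsymbol f\) is differentiable at \(\boldsymbol x_0\) and \((\boldsymbol g\circ \boldsymbol f)'(\boldsymbol x_0)=\boldsymbol g'(\boldsymbol f(\boldsymbol x_0))\boldsymbol f'(\boldsymbol x_0)\).

Remark In the theorem, \(\boldsymbol g'(\boldsymbol f(\boldsymbol x_0))\boldsymbol f'(\boldsymbol x_0)\) is the product (composition) of two linear maps.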

  • Proof Let \(A=\boldsymbol f'(\boldsymbol x_0)\), \(\boldsymbol y_0=\boldsymbol f(\boldsymbol x_0)\), \(B=\boldsymbol g'(\boldsymbol y_0)\), \(\boldsymbol u(\boldsymbol h)=\boldsymbol f(\boldsymbol x_0+\boldsymbol h)-\boldsymbol f(\boldsymbol x_0)-A\boldsymbol h\) and \(\boldsymbol v(\boldsymbol k)=\boldsymbol g(\boldsymbol y_0+\boldsymbol k)-\boldsymbol g(\boldsymbol y_0)-B\boldsymbol k\).
    Given \(\boldsymbol h\), let \(\boldsymbol k=\boldsymbol f(\boldsymbol x_0+\boldsymbol h)-\boldsymbol f(\boldsymbol x_0)\). Then \[ \begin{align*} (\boldsymbol g\circ \boldsymbol f)(\boldsymbol x_0+\boldsymbol h)-(\boldsymbol g\circ \boldsymbol f)(\boldsymbol x_0)-BA\boldsymbol h&=\boldsymbol g(\boldsymbol f(\boldsymbol x_0)+\boldsymbol k)-\boldsymbol g(\boldsymbol f(\boldsymbol x_0))-BA\boldsymbol h\\ &=B(\boldsymbol k-A\boldsymbol h)+\boldsymbol v(\boldsymbol k)\\ &=B\boldsymbol u(\boldsymbol h)+\boldsymbol v(\boldsymbol k). \end{align*} \] Since \(\boldsymbol f\) is differentiable at \(\boldsymbol x_0\), we have \(\lim\limits_{\boldsymbol h\to \boldsymbol0}\frac{|\boldsymbol u(\boldsymbol h)|}{|\boldsymbol h|}=0\), and in particular \(\lim\limits_{\boldsymbol h\to \boldsymbol0}|\boldsymbol u(\boldsymbol h)|=0\). From \(|\boldsymbol k|=|A\boldsymbol h+\boldsymbol u(\boldsymbol h)|\leq\Vert A\Vert|\boldsymbol h|+|\boldsymbol u(\boldsymbol h)|\) we get \(\lim\limits_{\boldsymbol h\to\boldsymbol0}|\boldsymbol k|=0\), and also that \(\frac{|\boldsymbol k|}{|\boldsymbol h|}\leq\Vert A\Vert+\frac{|\boldsymbol u(\boldsymbol h)|}{|\boldsymbol h|}\) stays bounded as \(\boldsymbol h\to\boldsymbol0\). Similarly, since \(\boldsymbol g\) is differentiable at \(\boldsymbol f(\boldsymbol x_0)\), \(\lim\limits_{\boldsymbol k\to\boldsymbol0}\frac{|\boldsymbol v(\boldsymbol k)|}{|\boldsymbol k|}=0\). When \(\boldsymbol k\neq\boldsymbol0\) we have \(\frac{|\boldsymbol v(\boldsymbol k)|}{|\boldsymbol h|}=\frac{|\boldsymbol v(\boldsymbol k)|}{|\boldsymbol k|}\cdot\frac{|\boldsymbol k|}{|\boldsymbol h|}\), and when \(\boldsymbol k=\boldsymbol0\) we have \(\boldsymbol v(\boldsymbol k)=\boldsymbol0\); in either case \(\lim\limits_{\boldsymbol h\to \boldsymbol0}\frac{|\boldsymbol v(\boldsymbol k)|}{|\boldsymbol h|}=0\). Therefore \[ \begin{align*} \frac{|(\boldsymbol g\circ \boldsymbol f)(\boldsymbol x_0+\boldsymbol h)-(\boldsymbol g\circ \boldsymbol f)(\boldsymbol x_0)-BA\boldsymbol h|}{|\boldsymbol h|}&=\frac{|B\boldsymbol u(\boldsymbol h)+\boldsymbol v(\boldsymbol k)|}{|\boldsymbol h|}\\ &\leq\Vert B\Vert\frac{|\boldsymbol u(\boldsymbol h)|}{|\boldsymbol h|}+\frac{|\boldsymbol v(\boldsymbol k)|}{|\boldsymbol h|}\to0\quad(\boldsymbol h\to\boldsymbol0). \end{align*} \] Hence \(\boldsymbol g\circ \boldsymbol f\) is differentiable at \(\boldsymbol x_0\), and \((\boldsymbol g\circ \boldsymbol f)'(\boldsymbol x_0)=BA=\boldsymbol g'(\boldsymbol f(\boldsymbol x_0))\boldsymbol f'(\boldsymbol x_0)\). \(\Box\)
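
As a numerical sanity check of the chain rule, the sketch below approximates the matrices of the derivatives by central differences and compares the matrix of \((\boldsymbol g\circ\boldsymbol f)'(\boldsymbol x_0)\) with the product of the matrices of \(\boldsymbol g'(\boldsymbol f(\boldsymbol x_0))\) and \(\boldsymbol f'(\boldsymbol x_0)\). The maps `f` and `g`, the point `x0`, and the helpers `jacobian` and `matmul` are illustrative assumptions, not part of the notes.

```python
import math

def jacobian(func, x, eps=1e-6):
    """Approximate the matrix of func'(x) by central differences
    (rows index outputs, columns index inputs)."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def matmul(B, A):
    """Matrix product B A, i.e. the matrix of the composed linear map."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

# Hypothetical smooth maps f, g : R^2 -> R^2 chosen only for illustration.
def f(x):
    return [x[0] * x[1], x[0] + x[1] ** 2]

def g(y):
    return [math.sin(y[0]) + y[1], y[0] * y[1]]

def g_of_f(x):
    return g(f(x))

x0 = [0.7, -1.3]
lhs = jacobian(g_of_f, x0)                          # matrix of (g o f)'(x0)
rhs = matmul(jacobian(g, f(x0)), jacobian(f, x0))   # matrix of g'(f(x0)) f'(x0)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The two matrices should agree only up to the finite-difference error, so `max_gap` is expected to be small rather than exactly zero.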

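\(\S1.2\) Differentiation on \(\mathbb R\)

We first look at the properties of differentiation for real functions.

For a real function \(f\) whose derivative at \(x\) is \(f'(x)\), we prefer to treat \(f'(x)\) as a real number, and hence \(f'\) as a real function. Then \(\lim\limits_{h\to0}\frac{|f(x+h)-f(x)-f'(x)h|}{|h|}=0\), that is, \(\lim\limits_{h\to0}\frac{f(x+h)-f(x)}{h}=f'(x)\). The following rules follow directly.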

Theorem 1.2.1 Let \(f\) and \(g\) be real functions on \((a,b)\) that are differentiable at \(x\in(a,b)\). Then \(f\pm g\) and \(fg\) are differentiable at \(x\), as is \(f/g\) provided \(g(x)\neq0\), and
(a) \((f(x)\pm g(x))'=f'(x)\pm g'(x)\);
(b) \([f(x)g(x)]'=f'(x)g(x)+f(x)g'(x)\);
(c) \([f(x)/g(x)]'=[f'(x)g(x)-f(x)g'(x)]/g^2(x)\).

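In addition, the relation between the derivative of \(f\) and its extrema and monotonicity follows directly from the definition; the details are routine, so we do not list them.

Next we introduce the mean value theorems, which link a function to its derivative.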

Theorem 1.2.2 (Rolle's theorem) Let \(f\) be a continuous real function on \([a,b]\) that is differentiable on \((a,b)\), with \(f(a)=f(b)\). Then \(\exists\xi\in(a,b)\) such that \(f'(\xi)=0\).

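  • Proof Since \(f\) is continuous on \([a,b]\), it attains a maximum \(M\) and a minimum \(m\) on \([a,b]\). If \(M=m\), then \(f\) is constant and the theorem clearly holds.
    Now let \(M\neq m\). Since \(f(a)=f(b)\), the values at the two endpoints cannot be \(M\) and \(m\) simultaneously, so at least one of them is attained in \((a,b)\). Without loss of generality, suppose \(\exists\xi\in(a,b)\) such that \(f(\xi)=M\); then \(f(\xi)-f(\xi+h)\geq0\) whenever \(\xi+h\in[a,b]\). Since \(f\) is differentiable at \(\xi\), \(\lim\limits_{h\to0}\frac{f(\xi+h)-f(\xi)}{h}=f'(\xi)\). But \(\frac{f(\xi+h)-f(\xi)}{h}\) is less than or equal to \(0\) when \(h>0\) and greater than or equal to \(0\) when \(h<0\), so by the properties of limits \(\lim\limits_{h\to0}\frac{f(\xi+h)-f(\xi)}{h}=0\), that is, \(f'(\xi)=0\). \(\Box\)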

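Theorem 1.2.3 (Lagrange mean value theorem) Let \(f\) be a continuous real function on \([a,b]\) that is differentiable on \((a,b)\). Then \(\exists\xi\in(a,b)\) such that \(f'(\xi)=\frac{f(b)-f(a)}{b-a}\).

  • Proof Let \(F(x)=f(x)-\frac{f(b)-f(a)}{b-a}x\). It is easy to see that \(F(b)=F(a)\), and that \(F\) is continuous on \([a,b]\) and differentiable on \((a,b)\). By Rolle's theorem, \(\exists\xi\in(a,b)\) such that \(F'(\xi)=0\), that is, \(f'(\xi)=\frac{f(b)-f(a)}{b-a}\). \(\Box\)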

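Later we will construct a real function that is continuous everywhere and differentiable nowhere. Before that, we record one link between continuity and differentiability on \(\mathbb{R}\): a derivative, even when it is not continuous, takes every intermediate value.

Theorem 1.2.4 (Darboux's theorem) Let \(f\) be a differentiable real function on \([a,b]\) with \(f'(a)<\lambda<f'(b)\). Then \(\exists\xi\in(a,b)\) such that \(f'(\xi)=\lambda\).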

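  • Proof Let \(g(x)=f(x)-\lambda x\). Then \(g'(a)<0\), so \(\exists x_1\in(a,b)\) such that \(g(x_1)<g(a)\); similarly \(g'(b)>0\), so \(\exists x_2\in(a,b)\) such that \(g(x_2)<g(b)\). Since \(g\) is continuous on \([a,b]\), it attains its minimum on \([a,b]\), and by the above the minimum is attained neither at \(a\) nor at \(b\), hence at some point \(\xi\in(a,b)\). By the relation between derivatives and extrema, \(g'(\xi)=0\), that is, \(f'(\xi)=\lambda\). \(\Box\)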

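Next we state L'Hospital's rule, which uses derivatives to compute certain limits.

Theorem 1.2.5 (L'Hospital's rule) Let \(f\) and \(g\) be real functions, and suppose \(\exists\delta>0\) such that on \((a-\delta,a)\cup(a,a+\delta)\):
(a) \(f\) and \(g\) are defined and differentiable;
(b) \(\lim\limits_{x\to a}f(x)=\lim\limits_{x\to a}g(x)=0\);
(c) \(g'(x)\neq0\) and \(\lim\limits_{x\to a}\frac{f'(x)}{g'(x)}=A\).
Then \(\lim\limits_{x\to a}\frac{f(x)}{g(x)}=A\), where \(A\) may be a real number or \(\pm\infty\).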

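If we regard the derivative of a differentiable real function as a real function, that derivative may itself be differentiable. This leads to higher-order derivatives of real functions.

Definition 1.2.1 Let \(f\) be a differentiable real function on \(X\); we write its derivative as \(f^{(1)}\) and call it the first derivative. For \(n\in\mathbb{N}^+\), if \(f\) has an \(n\)-th derivative \(f^{(n)}\) that is differentiable on \(X\), then the \((n+1)\)-th derivative \(f^{(n+1)}\) of \(f\) is defined as the derivative of \(f^{(n)}\) (again, we regard the \((n+1)\)-th derivative as a real function). As a special case, we write \(f^{(0)}=f\).

Remark If \(f^{(n)}\) exists at \(x\) (with \(n\geq1\)), then \(f^{(n-1)}\) must exist on some neighborhood of \(x\).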

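With this, we can state Taylor's theorem.

Theorem 1.2.6 (Taylor's theorem) Let \(f\) be a real function that has an \(n\)-th derivative at a point \(x_0\) of its domain, and define its Taylor polynomial as \(T_n(x)=\sum\limits_{k=0}^n\frac{f^{(k)}(x_0)}{k!}(x-x_0)^k\). Then \(f(x)=T_n(x)+o((x-x_0)^n)\) as \(x\to x_0\).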

  • Proof Let \(R_n(x)=f(x)-T_n(x)\) and \(Q_n(x)=(x-x_0)^n\); it suffices to show \(\lim\limits_{x\to x_0}\frac{R_n(x)}{Q_n(x)}=0\).
    Note that \(T^{(k)}_n(x_0)=f^{(k)}(x_0),k=0,1,\dots,n\), so \(R_n^{(k)}(x_0)=0,k=0,1,\dots,n\). Moreover, \(Q^{(k)}_n(x_0)=0,k=0,1,\dots,n-1\), and \(Q^{(n)}_n(x_0)=n!\). Since \(f^{(n)}(x_0)\) exists, \(f\) has an \((n-1)\)-th derivative on some neighborhood of \(x_0\), so applying L'Hospital's rule \(n-1\) times gives \[ \begin{align*} \lim_{x\to x_0}\frac{R_n(x)}{Q_n(x)}&=\lim_{x\to x_0}\frac{R'_n(x)}{Q'_n(x)}=\cdots=\lim_{x\to x_0}\frac{R^{(n-1)}_n(x)}{Q^{(n-1)}_n(x)}\\ &=\lim_{x\to x_0}\frac{f^{(n-1)}(x)-f^{(n-1)}(x_0)-f^{(n)}(x_0)(x-x_0)}{n!(x-x_0)}\\ &=\frac{1}{n!}\lim_{x\to x_0}[\frac{f^{(n-1)}(x)-f^{(n-1)}(x_0)}{x-x_0}-f^{(n)}(x_0)]\\ &=0. \end{align*} \] Hence \(f(x)=T_n(x)+o((x-x_0)^n)\). \(\Box\)
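
To see the \(o((x-x_0)^n)\) statement numerically, the following sketch evaluates \(\frac{f(x)-T_n(x)}{(x-x_0)^n}\) for \(f=\exp\), \(x_0=0\) and \(n=3\) at points approaching \(x_0\). The choice of function, the step sizes and the helper name `taylor_poly` are assumptions made only for this illustration.

```python
import math

def taylor_poly(derivs_at_x0, x0, x):
    """Evaluate T_n(x) = sum_k f^(k)(x0)/k! * (x - x0)^k from a list of derivatives at x0."""
    return sum(d / math.factorial(k) * (x - x0) ** k
               for k, d in enumerate(derivs_at_x0))

x0, n = 0.0, 3
derivs = [1.0] * (n + 1)          # for f = exp, every derivative at 0 equals 1

ratios = []
for step in (1e-1, 1e-2, 1e-3):
    x = x0 + step
    remainder = math.exp(x) - taylor_poly(derivs, x0, x)   # R_n(x)
    ratios.append(remainder / (x - x0) ** n)               # R_n(x) / Q_n(x)

print(ratios)
```

The printed ratios should shrink roughly in proportion to the step, consistent with \(R_n(x)/Q_n(x)\to0\); much smaller steps would be dominated by floating-point rounding.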

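In fact, this theorem only describes the approximation as \(x\to x_0\). More generally, we have the following theorem.

Theorem 1.2.7 Let \(f\) be a real function with a continuous \(n\)-th derivative on \([a,b]\) and an \((n+1)\)-th derivative on \((a,b)\). Then \(\forall x,x_0\in[a,b],\exists\xi\in(a,b)\) such that \(f(x)=T_n(x)+\frac{f^{(n+1)}(\xi)}{(n+1)!}(x-x_0)^{n+1}\), where \(T_n\) is defined as in the previous theorem.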

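  • Proof The case \(x=x_0\) is trivial, so consider \(x\neq x_0\); without loss of generality let \(x<x_0\). Let \[ M=\frac{f(x)-T_n(x)}{(x-x_0)^{n+1}},\quad F(t)=f(t)-\sum\limits_{k=0}^n\frac{f^{(k)}(x_0)}{k!}(t-x_0)^k-M(t-x_0)^{n+1}. \] It is easy to see that \(F^{(k)}(x_0)=0,k=0,1,\dots,n\), and that \(F(x)=0\) by the choice of \(M\). By Rolle's theorem, \(\exists x_1\in(x,x_0)\) such that \(F^{(1)}(x_1)=0\). Since also \(F^{(1)}(x_0)=0\), Rolle's theorem gives \(x_2\in(x_1,x_0)\) with \(F^{(2)}(x_2)=0\). Repeating this argument, \(\exists\xi\in(x_n,x_0)\) such that \(F^{(n+1)}(\xi)=f^{(n+1)}(\xi)-(n+1)!M=0\), that is, \(M=\frac{f^{(n+1)}(\xi)}{(n+1)!}\). \(\Box\)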

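Finally, using series of functions, we give a real function that is continuous everywhere and differentiable nowhere.

Theorem 1.2.8 There exists a real function that is continuous everywhere and differentiable nowhere.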

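  • Proof Define \[ \phi(x)\coloneqq\begin{cases}|x|,&-1\leq x\leq1,\\ \phi(x+2),&x<-1,\\ \phi(x-2),&x>1. \end{cases} \] It is easy to see that \(\phi\) is continuous on \(\mathbb{R}\) and that \(\forall x,y\in\mathbb{R},\lvert\phi(x)-\phi(y)\rvert\leq\lvert x-y\rvert\).
    Define \(f(x)\coloneqq\sum\limits_{n=0}^\infty\left(\frac34\right)^n\phi(4^nx)\). Since \(\phi(x)\in[0,1]\), the series converges uniformly on \(\mathbb{R}\), and hence \(f\) is continuous.
    Fix \(x\in\mathbb{R}\). For every positive integer \(m\), let \(\delta_m=\pm\frac{1}{2^{2m+1}}\), where the sign is chosen so that no integer lies strictly between \(4^mx\) and \(4^m(x+\delta_m)\); since \(4^m\lvert\delta_m\rvert=\frac12\), such a choice of sign exists.
    Let \(\alpha_n=\frac{\phi(4^n(x+\delta_m))-\phi(4^nx)}{\delta_m}\). When \(n>m\), \(4^n\delta_m\) is an even integer, so \(\alpha_n=0\); when \(n\leq m\), \(\lvert\alpha_n\rvert\leq 4^n\); and by the choice of sign, \(\lvert\alpha_m\rvert=4^m\). Hence \[ \left\lvert\frac{f(x+\delta_m)-f(x)}{\delta _m}\right\rvert=\left\lvert\sum_{n=0}^m\left(\frac34\right)^n\alpha_n\right\rvert\geq3^m-\sum_{n=0}^{m-1}3^n=\frac12(3^m+1). \] Since \(\delta_m\to0\) as \(m\to\infty\) while the right-hand side is unbounded, \(f\) is not differentiable at \(x\). We have thus constructed a real function that is continuous everywhere and differentiable nowhere. \(\Box\)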
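
The growth of the difference quotients in this proof can be checked with exact rational arithmetic. The sketch below assumes the sample point \(x=\frac3{10}\) and uses hypothetical helper names (`phi`, `choose_delta`, `difference_quotient`); because the terms with \(n>m\) cancel exactly, summing over \(n\leq m\) gives the exact value of \(\frac{f(x+\delta_m)-f(x)}{\delta_m}\).

```python
from fractions import Fraction
import math

def phi(t):
    """Period-2 extension of |t| on [-1, 1]."""
    r = t % 2                      # reduce to [0, 2)
    return r if r <= 1 else 2 - r

def choose_delta(x, m):
    """Return +-1/(2*4^m), signed so that no integer lies strictly
    between 4^m x and 4^m (x + delta)."""
    a = 4 ** m * x
    delta = Fraction(1, 2 * 4 ** m)
    smallest_above = math.floor(a) + 1          # smallest integer > a
    if smallest_above < a + Fraction(1, 2):     # an integer sits in (a, a + 1/2)
        delta = -delta
    return delta

def difference_quotient(x, m):
    """Exact value of (f(x + delta_m) - f(x)) / delta_m; terms with n > m vanish."""
    delta = choose_delta(x, m)
    total = sum(Fraction(3, 4) ** n * (phi(4 ** n * (x + delta)) - phi(4 ** n * x))
                for n in range(m + 1))
    return total / delta

x = Fraction(3, 10)                # sample point, an arbitrary choice
quotients = [difference_quotient(x, m) for m in range(1, 7)]
lower_bounds = [Fraction(3 ** m + 1, 2) for m in range(1, 7)]
for q, lb in zip(quotients, lower_bounds):
    print(float(abs(q)), ">=", float(lb))
```

Using `Fraction` avoids rounding error, which matters here because the bound \(\frac12(3^m+1)\) can be attained with equality.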

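We close with the relation between derivatives and convexity.

Theorem 1.2.9 Let \(f\) be differentiable on \((a,b)\). Then the following statements are equivalent:
(a) \(f\) is convex on \((a,b)\);
(b) \(f'\) is increasing on \((a,b)\);
(c) \(\forall x,y\in(a,b)\), \(f(y)\geq f(x)+f'(x)(y-x)\).

The implications among the three statements follow directly from the definitions of the derivative and of a convex function. When \(f'\) is differentiable on \((a,b)\), we can also derive the following theorem.

Theorem 1.2.10 Let \(f\) have a second derivative on \((a,b)\). Then \(f\) is convex on \((a,b)\) if and only if \(f''(x)\geq0,\forall x\in(a,b)\).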

\(\S1.3\) Directional and partial derivatives

Definition 1.3.1 Let \(E\subset\mathbb{R}^n\) be an open set, \(\boldsymbol f\colon E\to\mathbb{R}^m\) and \(\boldsymbol v\in\mathbb{R}^n\). For \(\boldsymbol x\in E\), the directional derivative of \(\boldsymbol f\) with respect to \(\boldsymbol v\) is defined, when the limit exists, as \(\frac{\partial\boldsymbol f}{\partial\boldsymbol v}(\boldsymbol x)\coloneqq\lim\limits_{t\to 0}\frac{\boldsymbol f(\boldsymbol x+t\boldsymbol v)-\boldsymbol f(\boldsymbol x)}{t}\).

Definition 1.3.2 Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\). For \(\boldsymbol x\in E\), the partial derivative of \(\boldsymbol f\) with respect to \(x_j\) is defined as \(\frac{\partial\boldsymbol f}{\partial x_j}(\boldsymbol x)\coloneqq\frac{\partial\boldsymbol f}{\partial\boldsymbol e_j}(\boldsymbol x)\).

Definition 1.3.3 Let \(E\subset\mathbb{R}^n\) be an open set and \(f\colon E\to\mathbb{R}\). The gradient of \(f\) at \(\boldsymbol x\in E\) is defined as the \(n\)-dimensional row vector \(\nabla f(\boldsymbol x)\coloneqq(\frac{\partial f}{\partial x_1}(\boldsymbol x),\dots,\frac{\partial f}{\partial x_n}(\boldsymbol x))\).

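In fact, by standard results on limits of functions on Euclidean space, the directional derivative of \(\boldsymbol f\) at \(\boldsymbol x\) with respect to \(\boldsymbol v\) exists if and only if the directional derivatives of all its components with respect to \(\boldsymbol v\) exist, and then \(\frac{\partial\boldsymbol f}{\partial \boldsymbol v}(\boldsymbol x)=(\frac{\partial f_1}{\partial \boldsymbol v}(\boldsymbol x),\dots,\frac{\partial f_m}{\partial \boldsymbol v}(\boldsymbol x))\). That is, to differentiate a vector-valued function it suffices to differentiate each of its components. For a function of a single real variable, the derivative and the partial derivative coincide.

Moreover, differentiability at a point implies that the directional derivatives exist there.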

Theorem 1.3.1 Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\). If \(\boldsymbol f\) is differentiable at \(\boldsymbol x\in E\), then for every \(\boldsymbol v\in\mathbb{R}^n\) the directional derivative of \(\boldsymbol f\) at \(\boldsymbol x\) with respect to \(\boldsymbol v\) exists, and \(\frac{\partial\boldsymbol f}{\partial\boldsymbol v}(\boldsymbol x)=\boldsymbol f'(\boldsymbol x)\boldsymbol v\).

  • Proof The case \(\boldsymbol v=\boldsymbol0\) is trivial. For \(\boldsymbol v\neq\boldsymbol0\), restricting \(\boldsymbol h\) in the definition of differentiability to \(t\boldsymbol v(t\in\mathbb{R})\) yields \(\frac{\partial\boldsymbol f}{\partial\boldsymbol v}(\boldsymbol x)=\lim\limits_{t\to0}\frac{\boldsymbol f(\boldsymbol x+t\boldsymbol v)-\boldsymbol f(\boldsymbol x)}{t}=\boldsymbol f'(\boldsymbol x)\boldsymbol v\). \(\Box\)

Corollary Under the hypotheses of the theorem above, when \(f\) is real-valued, \(\frac{\partial f}{\partial\boldsymbol v}(\boldsymbol x)=\nabla f(\boldsymbol x)\cdot\boldsymbol v\).
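
The corollary can be checked numerically. The sketch below compares a symmetric difference quotient along \(\boldsymbol v\) with \(\nabla f(\boldsymbol x)\cdot\boldsymbol v\) for a hypothetical function \(f(x,y)=x^2y+\sin y\); the function, the point `p`, the direction `v` and the step `t` are all assumptions chosen for illustration.

```python
import math

def f(p):
    """Hypothetical differentiable real-valued function on R^2."""
    x, y = p
    return x * x * y + math.sin(y)

def grad_f(p):
    """Gradient of f computed by hand: (2xy, x^2 + cos y)."""
    x, y = p
    return [2 * x * y, x * x + math.cos(y)]

def directional_derivative(func, p, v, t=1e-6):
    """Symmetric difference quotient approximating the limit in Definition 1.3.1."""
    forward = [pi + t * vi for pi, vi in zip(p, v)]
    backward = [pi - t * vi for pi, vi in zip(p, v)]
    return (func(forward) - func(backward)) / (2 * t)

p = [1.2, 0.5]
v = [3.0, -2.0]        # v need not be a unit vector in Definition 1.3.1
numeric = directional_derivative(f, p, v)
exact = sum(g * vi for g, vi in zip(grad_f(p), v))
print(numeric, exact)
```

The symmetric quotient is used only because it is more accurate in floating point; it has the same limit as the one-sided quotient whenever the directional derivative exists.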

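However, even if the partial derivatives with respect to \(x_1,\dots,x_n\) all exist, \(\boldsymbol f\) may still fail to be differentiable at \(\boldsymbol x\). Later we treat the case in which all partial derivatives are continuous.

Writing \(\boldsymbol f\) in component form, we have the following corollary.

Corollary Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\). If \(\boldsymbol f\) is differentiable at \(\boldsymbol x\in E\), then \(\boldsymbol f'(\boldsymbol x)\boldsymbol e_j=\sum\limits_{i=1}^m\frac{\partial f_i}{\partial x_j}(\boldsymbol x)\boldsymbol u_i\).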

By this corollary, writing \([\boldsymbol f'(\boldsymbol x)]\) for the matrix of \(\boldsymbol f'(\boldsymbol x)\) with respect to the standard orthonormal bases of \(\mathbb{R}^n\) and \(\mathbb{R}^m\), we have \[ [\boldsymbol f'(\boldsymbol x)]=\begin{pmatrix} \frac{\partial f_1}{\partial x_1}(\boldsymbol x)&\frac{\partial f_1}{\partial x_2}(\boldsymbol x)&\cdots&\frac{\partial f_1}{\partial x_n}(\boldsymbol x)\\ \frac{\partial f_2}{\partial x_1}(\boldsymbol x)&\frac{\partial f_2}{\partial x_2}(\boldsymbol x)&\cdots&\frac{\partial f_2}{\partial x_n}(\boldsymbol x)\\ \vdots&\vdots&&\vdots\\ \frac{\partial f_m}{\partial x_1}(\boldsymbol x)&\frac{\partial f_m}{\partial x_2}(\boldsymbol x)&\cdots&\frac{\partial f_m}{\partial x_n}(\boldsymbol x) \end{pmatrix}=\begin{pmatrix}\frac{\partial\boldsymbol f}{\partial x_1}(\boldsymbol x)^{\rm T},\frac{\partial\boldsymbol f}{\partial x_2}(\boldsymbol x)^{\rm T},\dots,\frac{\partial\boldsymbol f}{\partial x_n}(\boldsymbol x)^{\rm T}\end{pmatrix}=\begin{pmatrix}\nabla f_1(\boldsymbol x)\\\nabla f_2(\boldsymbol x)\\\vdots\\\nabla f_m(\boldsymbol x)\end{pmatrix}. \]

Definition 1.3.4 Let \(E\subset\mathbb{R}^n\) be an open set and let \(\boldsymbol f\colon E\to\mathbb{R}^m\) be differentiable on \(E\). If \(\boldsymbol f'\) is continuous, then \(\boldsymbol f\) is said to be continuously differentiable on \(E\); the set of all such maps is denoted \(\mathscr C'(E,\mathbb{R}^m)\).

Theorem 1.3.2 Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\). Then \(\boldsymbol f\in \mathscr C'(E,\mathbb{R}^m)\) if and only if all partial derivatives \(\frac{\partial\boldsymbol f}{\partial x_k},k=1,\dots,n\) of \(\boldsymbol f\) exist and are continuous on \(E\).

  • Proof By Theorem 1.3.1, for \(k=1,\dots,n\) we have \(\frac{\partial\boldsymbol f}{\partial x_k}(\boldsymbol x)=\boldsymbol f'(\boldsymbol x)\boldsymbol e_k\), and since \(\lvert\boldsymbol e_k\rvert=1\), \[ \lvert\frac{\partial\boldsymbol f}{\partial x_k}(\boldsymbol x)-\frac{\partial\boldsymbol f}{\partial x_k}(\boldsymbol y)\rvert\leq\lVert\boldsymbol f'(\boldsymbol x)-\boldsymbol f'(\boldsymbol y)\rVert. \] Hence \(\frac{\partial\boldsymbol f}{\partial x_k}\) is continuous whenever \(\boldsymbol f\in \mathscr C'(E,\mathbb{R}^m)\).
    For the converse, by the definition of differentiability, \(\boldsymbol f\) is differentiable if and only if each of its components is, so it suffices to consider a real-valued function \(f\), that is, \(m=1\).
    Take any \(\boldsymbol x\in E\) and any \(\varepsilon>0\). By the continuity of \(\frac{\partial f}{\partial x_k}\) for \(k=1,\dots,n\) and since \(E\) is open, \(\exists\delta>0\) such that \(B_{\mathbb{R}^n}(\boldsymbol x,\delta)\subset E\) and \(\forall \boldsymbol y\in B_{\mathbb{R}^n}(\boldsymbol x,\delta),|\frac{\partial f}{\partial x_k}(\boldsymbol x)-\frac{\partial f}{\partial x_k}(\boldsymbol y)|<\frac\varepsilon n\). Take any \(\boldsymbol h=(h_1,\dots,h_n)\) with \(|\boldsymbol h|<\delta\), and let \(\boldsymbol v_0=\boldsymbol 0\), \(\boldsymbol v_k=\sum\limits_{i=1}^k h_i\boldsymbol e_i,k=1,\dots,n\). Then \(\lvert\boldsymbol v_k\rvert\leq\vert\boldsymbol h\rvert<\delta\) and \[ f(\boldsymbol x+\boldsymbol h)-f(\boldsymbol x)=\sum\limits_{k=1}^n[f(\boldsymbol x+\boldsymbol v_k)-f(\boldsymbol x+\boldsymbol v_{k-1})]. \] For every \(k=1,\dots,n\), since \(B_{\mathbb{R}^n}(\boldsymbol x,\delta)\) is convex, the points \(\boldsymbol x+\boldsymbol v_{k-1}+th_k\boldsymbol e_k,t\in[0,1]\) all lie in \(B_{\mathbb{R}^n}(\boldsymbol x,\delta)\). Let \(g_k(t)=f(\boldsymbol x+\boldsymbol v_{k-1}+th_k\boldsymbol e_k),t\in[0,1]\). Differentiating \(g_k\) with respect to \(t\) amounts to taking the partial derivative of \(f\) with respect to \(x_k\), namely \(g_k'(t)=h_k\frac{\partial f}{\partial x_k}(\boldsymbol x+\boldsymbol v_{k-1}+th_k\boldsymbol e_k)\), so the mean value theorem gives \[ f(\boldsymbol x+\boldsymbol v_k)-f(\boldsymbol x+\boldsymbol v_{k-1})=h_k\frac{\partial f}{\partial x_k}(\boldsymbol x+\boldsymbol v_{k-1}+\theta_kh_k\boldsymbol e_k), \] where \(\theta_k\in(0,1)\).
    Since also \(\lvert \frac{\partial f}{\partial x_k}(\boldsymbol x+\boldsymbol v_{k-1}+\theta_kh_k\boldsymbol e_k)-\frac{\partial f}{\partial x_k}(\boldsymbol x)\rvert<\frac\varepsilon n\), for every \(\boldsymbol h\) with \(\lvert\boldsymbol h\rvert<\delta\) we have \[ \lvert f(\boldsymbol x+\boldsymbol h)-f(\boldsymbol x)-\sum\limits_{k=1}^nh_k\frac{\partial f}{\partial x_k}(\boldsymbol x)\rvert\leq\frac1 n\sum_{k=1}^n\lvert h_k\rvert\varepsilon\leq\lvert\boldsymbol h\rvert\varepsilon. \] Hence \(f\) is differentiable at \(\boldsymbol x\), with \(f'(\boldsymbol x)=(\frac{\partial f}{\partial x_1}(\boldsymbol x),\dots,\frac{\partial f}{\partial x_n}(\boldsymbol x))\); since each of its components is a continuous function, \(f\in \mathscr C'(E,\mathbb{R})\). \(\Box\)

Theorem 1.3.3 Let \(E\subset\mathbb{R}^n\) be a convex open set and \(\boldsymbol f\colon E\to\mathbb{R}^m\). If \(\boldsymbol f\) is differentiable on \(E\) and there is a real number \(M\) such that \(\lVert\boldsymbol f'(\boldsymbol x)\rVert\leq M,\forall\boldsymbol x\in E\), then \(\forall\boldsymbol x,\boldsymbol y\in E,\lvert\boldsymbol f(\boldsymbol x)-\boldsymbol f(\boldsymbol y)\rvert\leq M\lvert\boldsymbol x-\boldsymbol y\rvert\).

  • Proof For any \(\boldsymbol x,\boldsymbol y\in E\), let \(\boldsymbol g(t)=\boldsymbol f((1-t)\boldsymbol x+t\boldsymbol y),t\in[0,1]\). By the chain rule \(\boldsymbol g'(t)=\boldsymbol f'((1-t)\boldsymbol x+t\boldsymbol y)(\boldsymbol y-\boldsymbol x)\), so \(\lvert\boldsymbol g'(t)\rvert\leq\lVert\boldsymbol f'((1-t)\boldsymbol x+t\boldsymbol y)\rVert\lvert\boldsymbol y-\boldsymbol x\rvert\leq M\lvert\boldsymbol x-\boldsymbol y\rvert,\forall t\in[0,1]\).
    Let \[ \boldsymbol v=\boldsymbol g(1)-\boldsymbol g(0)=\boldsymbol f(\boldsymbol y)-\boldsymbol f(\boldsymbol x),\quad h(t)=\boldsymbol v\cdot\boldsymbol g(t),t\in[0,1]. \] By the mean value theorem there is \(\theta\in(0,1)\) with \(h(1)-h(0)=h'(\theta)=\boldsymbol v\cdot\boldsymbol g'(\theta)\). Since also \(h(1)-h(0)=\boldsymbol v\cdot\boldsymbol v\), we get \(\lvert\boldsymbol v\rvert^2=\lvert\boldsymbol v\cdot\boldsymbol g'(\theta)\rvert\leq\lvert\boldsymbol v\rvert\lvert\boldsymbol g'(\theta)\rvert\), that is, \(\lvert\boldsymbol f(\boldsymbol x)-\boldsymbol f(\boldsymbol y)\rvert\leq\lvert\boldsymbol g'(\theta)\rvert\leq M\lvert\boldsymbol x-\boldsymbol y\rvert\). \(\Box\)

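II. Inverse and Implicit Functions

We now apply the multivariable differentiation developed above to inverse and implicit functions. We first present the contraction mapping theorem as a tool, and then use it to prove the inverse function theorem and the implicit function theorem.

\(\S2.1\) Contraction mappings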

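Definition 2.1.1 Let \((X,d)\) be a metric space and \(\varphi\colon X\to X\). If there exists \(c<1\) such that \(\forall x,y\in X\), \(d(\varphi(x),\varphi(y))\leq cd(x,y)\), then \(\varphi\) is called a contraction mapping on \(X\).

Remark It follows from the definition that a contraction mapping \(\varphi\) is uniformly continuous.

Theorem 2.1.1 (Contraction mapping theorem) Let \((X,d)\) be a complete metric space and \(\varphi\) a contraction mapping on \(X\). Then there exists a unique point \(x\in X\) such that \(\varphi(x)=x\).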

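  • Proof Take any \(x_0\in X\) and construct the sequence \(\{x_n\}\) recursively by \(x_{n+1}=\varphi(x_n)\). Choosing \(c(<1)\) as in the definition of a contraction mapping, we have \[ d(x_{n+1},x_n)=d(\varphi(x_n),\varphi(x_{n-1}))\leq cd(x_n,x_{n-1}),\forall n\in\mathbb{N^+}, \] and hence \(d(x_{n+1},x_n)\leq c^nd(x_1,x_0)\). Then for any \(n>m\), \[ d(x_n,x_m)\leq\sum\limits_{k=m}^{n-1}d(x_{k},x_{k+1})\leq\sum\limits_{k=m}^{n-1}c^kd(x_0,x_1)\leq\frac{c^m}{1-c}d(x_1,x_0). \] It follows that \(\{x_n\}\) is a Cauchy sequence, so, \(X\) being complete, \(\{x_n\}\) converges to some \(x\in X\).
    Since \(\varphi\) is continuous, \(\varphi(x)=\lim\limits_{n\to\infty}\varphi(x_n)=\lim\limits_{n\to\infty}x_{n+1}=x\). For uniqueness, if \(\varphi(x)=x\) and \(\varphi(y)=y\), then \(d(x,y)=d(\varphi(x),\varphi(y))\leq cd(x,y)\), which forces \(d(x,y)=0\). \(\Box\)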

Remark If \((X,d)\) is not complete, then there is still at most one point \(x\in X\) such that \(\varphi(x)=x\).
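
The iteration \(x_{n+1}=\varphi(x_n)\) from the proof can be run directly. As a minimal sketch, assume \(X=[0,1]\) with the usual metric and \(\varphi=\cos\), which maps \([0,1]\) into itself and is a contraction there with constant \(c=\sin1<1\) by the mean value theorem; the function name `fixed_point`, the tolerance and the iteration cap are hypothetical choices.

```python
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=10000):
    """Iterate x_{n+1} = phi(x_n) until successive iterates differ by at most tol.
    Returns the last iterate and the number of steps taken."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

c = math.sin(1.0)                       # contraction constant of cos on [0, 1]
x_star, steps = fixed_point(math.cos, 0.0)
print(x_star, steps)
```

Stopping when successive iterates are within `tol` is a practical criterion: by the contraction property the distance from the returned point to the true fixed point is then at most \(\frac{c}{1-c}\) times `tol`.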

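\(\S2.2\) Inverse function theorem

We first state two lemmas from linear algebra without proof.

Lemma 2.2.1 Let \(\Omega\) be the set of all invertible linear maps in \(L(\mathbb{R}^n,\mathbb{R}^n)\). If \(A\in\Omega\), \(B\in L(\mathbb{R}^n,\mathbb{R}^n)\) and \(\lVert B-A\rVert\cdot\lVert A^{-1}\rVert<1\), then \(B\in\Omega\).

Lemma 2.2.2 Let \(\Omega\) be the set of all invertible linear maps in \(L(\mathbb{R}^n,\mathbb{R}^n)\). Then \(f\colon \Omega\to\Omega,A\mapsto A^{-1}\) is continuous.

Next, we use these two lemmas and the contraction mapping theorem to prove the inverse function theorem.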

Theorem 2.2.1 (Inverse function theorem) Let \(E\subset\mathbb{R}^n\) be an open set and \(\boldsymbol f\in \mathscr C'(E,\mathbb{R}^n)\), and suppose that \(\boldsymbol f'(\boldsymbol a)\) is invertible for some \(\boldsymbol a\in E\). Let \(\boldsymbol b=\boldsymbol f(\boldsymbol a)\). Then there exist open sets \(U\) and \(V\) in \(\mathbb{R}^n\) with \(\boldsymbol a\in U\) and \(\boldsymbol b\in V\) such that \(\boldsymbol f\) is a bijection from \(U\) onto \(V\), and \(\boldsymbol f^{-1}\in \mathscr C'(V,U)\).

  • Proof Let \(A=\boldsymbol f'(\boldsymbol a)\) and take \(R=\frac1{2\lVert A^{-1}\rVert}\). Since \(\boldsymbol f'\) is continuous at \(\boldsymbol a\), there is an open ball \(U\subset E\) centered at \(\boldsymbol a\) such that \[ \lVert \boldsymbol f'(\boldsymbol x)-A\rVert<R,\forall \boldsymbol x\in U.\tag{1}\label{9:1} \] For every \(\boldsymbol y\in\mathbb{R}^n\), let \(\phi_{\boldsymbol y}(\boldsymbol x)=\boldsymbol x+A^{-1}(\boldsymbol y-\boldsymbol f(\boldsymbol x))\); clearly \(\boldsymbol f(\boldsymbol x)=\boldsymbol y\) if and only if \(\phi_{\boldsymbol y}(\boldsymbol x)=\boldsymbol x\).
    Since \(\phi_{\boldsymbol y}'(\boldsymbol x)=I-A^{-1}\boldsymbol f'(\boldsymbol x)=A^{-1}(A-\boldsymbol f'(\boldsymbol x))\), we have \(\lVert \phi_{\boldsymbol y}'(\boldsymbol x)\rVert\leq \lVert A^{-1}\rVert\lVert A-\boldsymbol f'(\boldsymbol x)\rVert<\frac12\) on \(U\), so Theorem 1.3.3 gives \[ \lvert\phi_{\boldsymbol y}(\boldsymbol x_1)-\phi_{\boldsymbol y}(\boldsymbol x_2)\rvert\leq\frac12\lvert\boldsymbol x_1-\boldsymbol x_2\rvert,\forall \boldsymbol x_1,\boldsymbol x_2\in U.\tag{2}\label{9:2} \] Hence \(\phi_{\boldsymbol y}\) has at most one fixed point in \(U\), so there is at most one point \(\boldsymbol x\in U\) such that \(\boldsymbol f(\boldsymbol x)=\boldsymbol y\); that is, \(\boldsymbol f\) is injective on \(U\).
    Let \(V=\boldsymbol f(U)\); we now show that \(V\) is open. Take any \(\boldsymbol y_0\in V\); then there is a unique \(\boldsymbol x_0\in U\) such that \(\boldsymbol f(\boldsymbol x_0)=\boldsymbol y_0\). Take an open ball \(B\) centered at \(\boldsymbol x_0\) with radius \(r\) small enough that \(\bar B\subset U\). For any point \(\boldsymbol y\) in the open ball \(B_{\mathbb{R}^n}(\boldsymbol y_0,Rr)\), we have \[ \lvert\phi_{\boldsymbol y}(\boldsymbol x_0)-\boldsymbol x_0\rvert=\lvert A^{-1}(\boldsymbol y-\boldsymbol y_0)\rvert<\lVert A^{-1}\rVert Rr=\frac r2. \] When \(\boldsymbol x\in\bar B\), combining this with \(\eqref{9:2}\) gives \[ \lvert\phi_{\boldsymbol y}(\boldsymbol x)-\boldsymbol x_0\rvert\leq\lvert\phi_{\boldsymbol y}(\boldsymbol x)-\phi_{\boldsymbol y}(\boldsymbol x_0)\rvert+\lvert\phi_{\boldsymbol y}(\boldsymbol x_0)-\boldsymbol x_0\rvert<\frac12\lvert\boldsymbol x-\boldsymbol x_0\rvert+\frac r2\leq r. \] Hence \(\phi_{\boldsymbol y}(\boldsymbol x)\in B\), so \(\phi_{\boldsymbol y}\) is a contraction mapping on \(\bar B\), which is complete as a closed subset of \(\mathbb{R}^n\). Therefore there is a unique point \(\boldsymbol x\in\bar B\) such that \(\boldsymbol f(\boldsymbol x)=\boldsymbol y\), that is, \(\boldsymbol y\in\boldsymbol f(\bar B)\subset\boldsymbol f(U)=V\). Thus \(B_{\mathbb{R}^n}(\boldsymbol y_0,Rr)\subset V\), which proves that \(V\) is open.
    Next, we prove \(\boldsymbol f^{-1}\in \mathscr C'(V,U)\). Take any \(\boldsymbol y\in V\) and \(\boldsymbol k\) small enough that \(\boldsymbol y+\boldsymbol k\in V\); then there exist \(\boldsymbol x,\boldsymbol x+\boldsymbol h\in U\) such that \(\boldsymbol f(\boldsymbol x)=\boldsymbol y\) and \(\boldsymbol f(\boldsymbol x+\boldsymbol h)=\boldsymbol y+\boldsymbol k\), and so \(\phi_{\boldsymbol y}(\boldsymbol x+\boldsymbol h)-\phi_{\boldsymbol y}(\boldsymbol x)=\boldsymbol h-A^{-1}\boldsymbol k\). By \(\eqref{9:2}\), \(\lvert \boldsymbol h-A^{-1}\boldsymbol k\rvert\leq\frac12\lvert\boldsymbol h\rvert\), that is, \(\lvert A^{-1}\boldsymbol k\rvert\geq\frac12\lvert\boldsymbol h\rvert\), and hence \[ \lvert\boldsymbol h\rvert\leq2\lVert A^{-1}\rVert\lvert\boldsymbol k\rvert=\frac{\lvert\boldsymbol k\rvert}{R}.\tag{3}\label{9:3} \] By \(\eqref{9:1}\) and Lemma 2.2.1, \(\boldsymbol f'(\boldsymbol x)\) is invertible, and \[ \boldsymbol f^{-1}(\boldsymbol y+\boldsymbol k)-\boldsymbol f^{-1}(\boldsymbol y)-\boldsymbol f'(\boldsymbol x)^{-1}\boldsymbol k=-\boldsymbol f'(\boldsymbol x)^{-1}[\boldsymbol f(\boldsymbol x+\boldsymbol h)-\boldsymbol f(\boldsymbol x)-\boldsymbol f'(\boldsymbol x)\boldsymbol h]. \] Substituting \(\eqref{9:3}\), we obtain \[ \frac{\lvert\boldsymbol f^{-1}(\boldsymbol y+\boldsymbol k)-\boldsymbol f^{-1}(\boldsymbol y)-\boldsymbol f'(\boldsymbol x)^{-1}\boldsymbol k\rvert}{\lvert\boldsymbol k\rvert}\leq\frac{\lVert\boldsymbol f'(\boldsymbol x)^{-1}\rVert}{R}\cdot\frac{\lvert\boldsymbol f(\boldsymbol x+\boldsymbol h)-\boldsymbol f(\boldsymbol x)-\boldsymbol f'(\boldsymbol x)\boldsymbol h\rvert}{\lvert\boldsymbol h\rvert}. \] By \(\eqref{9:3}\), \(\boldsymbol h\to\boldsymbol0\) as \(\boldsymbol k\to\boldsymbol0\), so the differentiability of \(\boldsymbol f\) shows that \(\boldsymbol f^{-1}\) is differentiable as well, with \((\boldsymbol f^{-1})'(\boldsymbol y)=\{\boldsymbol f'(\boldsymbol f^{-1}(\boldsymbol y))\}^{-1}\). In particular \(\boldsymbol f^{-1}\) is continuous; combining this with the continuity of \(\boldsymbol f'\) and Lemma 2.2.2, \((\boldsymbol f^{-1})'\) is continuous too, and hence \(\boldsymbol f^{-1}\in \mathscr C'(V,U)\). \(\Box\)
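
The map \(\phi_{\boldsymbol y}(\boldsymbol x)=\boldsymbol x+A^{-1}(\boldsymbol y-\boldsymbol f(\boldsymbol x))\) used in this proof also gives a practical way to evaluate the local inverse: iterate it from a point near \(\boldsymbol a\). The sketch below assumes a hypothetical map \(\boldsymbol f(x_1,x_2)=(x_1+0.1x_2^2,\;x_2+0.1\sin x_1)\) with \(\boldsymbol a=\boldsymbol0\), for which the matrix of \(A=\boldsymbol f'(\boldsymbol a)\) has rows \((1,0)\) and \((0.1,1)\), so that the matrix of \(A^{-1}\) has rows \((1,0)\) and \((-0.1,1)\); the target `y` and the iteration count are arbitrary choices.

```python
import math

def f(x):
    """Hypothetical C' map R^2 -> R^2 with f(0, 0) = (0, 0)."""
    return [x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * math.sin(x[0])]

# Inverse of A = f'(0, 0) = [[1, 0], [0.1, 1]], computed by hand.
A_INV = [[1.0, 0.0], [-0.1, 1.0]]

def phi(x, y):
    """phi_y(x) = x + A^{-1} (y - f(x)); its fixed points solve f(x) = y."""
    fx = f(x)
    r = [y[0] - fx[0], y[1] - fx[1]]
    return [x[0] + A_INV[0][0] * r[0] + A_INV[0][1] * r[1],
            x[1] + A_INV[1][0] * r[0] + A_INV[1][1] * r[1]]

def local_inverse(y, x_start=(0.0, 0.0), iterations=50):
    """Approximate f^{-1}(y) near a = (0, 0) by iterating the contraction phi_y."""
    x = list(x_start)
    for _ in range(iterations):
        x = phi(x, y)
    return x

y = [0.05, -0.03]                  # a target close to b = f(a) = (0, 0)
x = local_inverse(y)
residual = max(abs(fi - yi) for fi, yi in zip(f(x), y))
print(x, residual)
```

Note that the iteration only uses \(A^{-1}\) at the single point \(\boldsymbol a\), exactly as in the proof; it is not Newton's method, which would re-evaluate the derivative at every step.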

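\(\S2.3\) Implicit function theorem

We now use the inverse function theorem to prove the implicit function theorem. We first state three lemmas from linear algebra.

Lemma 2.3.1 Let \(V\) and \(W\) be linear spaces and \(A\in L(V,W)\). Then \(A\) is injective if and only if \({\rm Ker}A\) contains only the zero element.

Lemma 2.3.2 Let \(V\) and \(W\) be finite-dimensional linear spaces with \({\rm dim}V={\rm dim}W\), and \(A\in L(V,W)\). Then \(A\) is bijective if and only if \(A\) is injective.

Lemma 2.3.3 Let \(A\in L(\mathbb{R}^{n+m},\mathbb{R}^n)\). For any \(\boldsymbol h\in\mathbb{R}^n\) and \(\boldsymbol k\in\mathbb{R}^m\), define \(A_x\boldsymbol h\coloneqq A(\boldsymbol h,\boldsymbol 0)\) and \(A_y\boldsymbol k\coloneqq A(\boldsymbol 0,\boldsymbol k)\). Then \(A_x\in L(\mathbb{R}^n,\mathbb{R}^n),A_y\in L(\mathbb{R}^m,\mathbb{R}^n)\), and \(A(\boldsymbol h,\boldsymbol k)=A_x\boldsymbol h+A_y\boldsymbol k\).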

Theorem 2.3.1 (Implicit function theorem) Let \(E\subset\mathbb{R}^{n+m}\) be an open set and \(\boldsymbol f\in \mathscr C'(E,\mathbb{R}^n)\), and suppose \(\boldsymbol f(\boldsymbol a,\boldsymbol b)=\boldsymbol 0\) at some \((\boldsymbol a,\boldsymbol b)\in E\). Let \(A=\boldsymbol f'(\boldsymbol a,\boldsymbol b)\), with \(A_x\) and \(A_y\) defined as in Lemma 2.3.3. If \(A_x\) is invertible, then there exist an open set \(U\subset\mathbb{R}^{n+m}\) containing \((\boldsymbol a,\boldsymbol b)\) and an open set \(V\subset\mathbb{R}^m\) containing \(\boldsymbol b\) for which there is exactly one function \(\boldsymbol g\colon V\to\mathbb{R}^n\) such that \((\boldsymbol g(\boldsymbol y),\boldsymbol y)\in U\) and \(\boldsymbol f(\boldsymbol g(\boldsymbol y),\boldsymbol y)=\boldsymbol 0,\forall\boldsymbol y\in V\). Moreover, \(\boldsymbol g\in \mathscr C'(V,\mathbb{R}^n)\), and \(\boldsymbol g'(\boldsymbol b)=-(A_x)^{-1}A_y\).

  • Proof Define \(\boldsymbol F\colon E\to\mathbb{R}^{n+m},(\boldsymbol x,\boldsymbol y)\mapsto(\boldsymbol f(\boldsymbol x,\boldsymbol y),\boldsymbol y)\). Clearly \(\boldsymbol F\in \mathscr C'(E,\mathbb{R}^{n+m})\). We first show that \(\boldsymbol F'(\boldsymbol a,\boldsymbol b)\) is invertible.
    Let \(\boldsymbol r(\boldsymbol h,\boldsymbol k)=\boldsymbol f(\boldsymbol a+\boldsymbol h,\boldsymbol b+\boldsymbol k)-A(\boldsymbol h,\boldsymbol k)\). Since \(\boldsymbol f(\boldsymbol a,\boldsymbol b)=\boldsymbol 0\), we have \[ \boldsymbol F(\boldsymbol a+\boldsymbol h,\boldsymbol b+\boldsymbol k)-\boldsymbol F(\boldsymbol a,\boldsymbol b)=(A(\boldsymbol h,\boldsymbol k),\boldsymbol k)+(\boldsymbol r(\boldsymbol h,\boldsymbol k),\boldsymbol 0), \] from which we see that \(\boldsymbol F'(\boldsymbol a,\boldsymbol b)\) is the linear map sending \((\boldsymbol h,\boldsymbol k)\) to \((A(\boldsymbol h,\boldsymbol k),\boldsymbol k)\). If \((\boldsymbol h,\boldsymbol k)\) is sent to \(\boldsymbol 0\), then \(\boldsymbol k=\boldsymbol 0\) and \(A(\boldsymbol h,\boldsymbol 0)=\boldsymbol 0\); by Lemma 2.3.3 this means \(A_x\boldsymbol h=\boldsymbol 0\), and since \(A_x\) is invertible, \(\boldsymbol h=\boldsymbol 0\). By Lemmas 2.3.1 and 2.3.2, \(\boldsymbol F'(\boldsymbol a,\boldsymbol b)\) is bijective and hence invertible.
    Next, by the inverse function theorem, there exist in \(\mathbb{R}^{n+m}\) an open set \(U\) containing \((\boldsymbol a,\boldsymbol b)\) and an open set \(W\) containing \((\boldsymbol 0,\boldsymbol b)\) such that \(\boldsymbol F\) is a bijection from \(U\) onto \(W\). Take \(V=\{\boldsymbol y\in\mathbb{R}^m\colon (\boldsymbol 0,\boldsymbol y)\in W\}\); clearly \(V\) is open as well. For each \(\boldsymbol y\in V\), the definition of \(\boldsymbol F\) and its bijectivity on \(U\) show that there is \(\boldsymbol x\) with \((\boldsymbol x,\boldsymbol y)\in U\) and \(\boldsymbol F(\boldsymbol x,\boldsymbol y)=(\boldsymbol 0,\boldsymbol y)\), that is, \(\boldsymbol f(\boldsymbol x,\boldsymbol y)=\boldsymbol 0\), and this \(\boldsymbol x\) is unique. This proves that there is a unique function \(\boldsymbol g\) with \(\boldsymbol f(\boldsymbol g(\boldsymbol y),\boldsymbol y)=\boldsymbol 0,\forall\boldsymbol y\in V\).
    We now prove \(\boldsymbol g\in \mathscr C'(V,\mathbb{R}^n)\). From the above, \(\boldsymbol F^{-1}(\boldsymbol 0,\boldsymbol y)=(\boldsymbol g(\boldsymbol y),\boldsymbol y)\). For \(\boldsymbol y\in V\) and \(k=1,\dots,m\), note that \(\frac{\partial\boldsymbol F^{-1}}{\partial x_{n+k}}(\boldsymbol 0,\boldsymbol y)=(\frac{\partial\boldsymbol g}{\partial y_k}(\boldsymbol y),\boldsymbol e_k)\), where \(\boldsymbol e_k\) here denotes the \(k\)-th standard basis vector of \(\mathbb{R}^m\). By the inverse function theorem \(\boldsymbol F^{-1}\in \mathscr C'(W,U)\), so the partial derivatives of \(\boldsymbol g\) are continuous, and hence \(\boldsymbol g\in \mathscr C'(V,\mathbb{R}^n)\) by Theorem 1.3.2.
    Finally, we prove \(\boldsymbol g'(\boldsymbol b)=-(A_x)^{-1}A_y\). Let \(\boldsymbol G(\boldsymbol y)=(\boldsymbol g(\boldsymbol y),\boldsymbol y)\); it is not hard to show that \(\boldsymbol G'(\boldsymbol y)\) is the linear map sending \(\boldsymbol h\) to \((\boldsymbol g'(\boldsymbol y)\boldsymbol h,\boldsymbol h)\). Since \(\boldsymbol f(\boldsymbol g(\boldsymbol y),\boldsymbol y)=\boldsymbol 0\) holds throughout \(V\), we have \(\boldsymbol f(\boldsymbol G(\boldsymbol y))=\boldsymbol 0\). By the chain rule, \(\boldsymbol f'(\boldsymbol G(\boldsymbol y))\boldsymbol G'(\boldsymbol y)=0\). Noting that \(\boldsymbol G(\boldsymbol b)=(\boldsymbol a,\boldsymbol b)\), we see that \(A\boldsymbol G'(\boldsymbol b)\) is the zero map, and then by Lemma 2.3.3, \[ A\boldsymbol G'(\boldsymbol b)\boldsymbol h=A(\boldsymbol g'(\boldsymbol b)\boldsymbol h,\boldsymbol h)=A_x\boldsymbol g'(\boldsymbol b)\boldsymbol h+A_y\boldsymbol h=\boldsymbol 0, \] that is, \(\boldsymbol g'(\boldsymbol b)=-(A_x)^{-1}A_y\). \(\Box\)
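
As a small worked check of the formula \(\boldsymbol g'(\boldsymbol b)=-(A_x)^{-1}A_y\), take \(n=m=1\) and the hypothetical example \(f(x,y)=x^2+y^2-1\) at \((a,b)=(0.6,0.8)\), where \(A_x=2a\) and \(A_y=2b\). The sketch below solves \(f(x,y)=0\) for \(x\) near \(a\) numerically and compares a difference quotient of the resulting \(g\) with the predicted value. Newton's method in `solve_x` is only an illustrative solver, not the construction used in the proof.

```python
def f(x, y):
    """Hypothetical example: the unit circle as the zero set of f."""
    return x * x + y * y - 1.0

def solve_x(y, x_start, iterations=50):
    """Solve f(x, y) = 0 for x near x_start by Newton's method in x
    (an illustrative solver; df/dx = 2x)."""
    x = x_start
    for _ in range(iterations):
        x -= f(x, y) / (2.0 * x)
    return x

a, b = 0.6, 0.8
A_x, A_y = 2.0 * a, 2.0 * b        # partial derivatives of f at (a, b)
predicted = -A_y / A_x             # -(A_x)^{-1} A_y in the scalar case

step = 1e-6
numeric = (solve_x(b + step, a) - solve_x(b - step, a)) / (2 * step)
print(numeric, predicted)
```

Here the implicit function is \(g(y)=\sqrt{1-y^2}\) near \(b\), so the predicted derivative \(-\frac{b}{a}=-\frac43\) can also be confirmed by hand.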