文献阅读-APY方法

2023-04-14

字数统计: 1.5k | 阅读时长≈ 7 分钟

APY 方法的相应文献。

引言

一步法要对 G 阵求逆，其计算和存储花销是 \(n^3\) 和 \(n^2\) ，因此当基因型数据量很大时计算量很大。

Misztal (2014) 提出了 APY 算法（an algorithm for proven and young animals）来计算 \(\mathbf{G}^{-1}\) ，这里称为 \(\mathbf{G}^{-1}_{APY}\) ，一开始这里的 proven 指的是初始的个体，young 指的是更年轻的个体。

之后的研究发现（Fragomeni, 2015）发现没有必要通过年纪对个体进行排序，因此此时这个算法将基因型个体划分为核心个体 (core) 和非核心个体 (noncore) 。当基因型个体总数为 50万，核心群体数目为 2万是，APY 方法的计算和存储花销分别是常规求逆的 0.3% 和 8% 。

当核心群体数目超过 1万时，APY 方法计算的 GEBV 结果与常规结果非常类似。

之前的重心都关注APY算法的可行性，效率并不是最主要的考虑。这篇文章关注如何研究出一个高效的 APY 算法，并用实际的奶牛数据进行计算。

方法

常规的 G 阵构建方法如下（VanRaden, 2008） \[ \mathbf{G}=\frac{\mathbf{Z Z}^{\prime}}{2 \sum p_j\left(1-p_j\right)} \text { and } \mathbf{Z}=(\mathbf{M}-\mathbf{P}) \] 这里 \(p_{j}\) 是第二个碱基的基因频率（当前的基因型个体），M 矩阵包含所有位点的基因型，P 矩阵包含 \(2p_{j}\) 。

我们将 \(\mathbf{G}\) 矩阵分成 4 块： \[ \mathbf{G}=\left[\begin{array}{ll} \mathbf{G}_{c c} & \mathbf{G}_{c n} \\ \mathbf{G}_{n c} & \mathbf{G}_{n n} \end{array}\right], \] 这里下标 c 为核心群体，n 为非核心群体。这里的 \(\mathbf{G}\) 经过了 blending 从而确保其为非奇异矩阵，即 \(0.95\mathbf{G}+0.05\mathbf{A}_{22}\)。这里的 \(\mathbf{A}_{22}\) 矩阵使用 (Aguilar, 2011) 中的算法得到，但是只使用其中相对于 \(\mathbf{G}_{cc}\) , \(\mathbf{G}_{cn}\) 和 \(\mathbf{G}_{nn}\) 对角线位置的元素加入到 \(\mathbf{G}\) 阵中。然后对 blending 之后的 \(\mathbf{G}\) 阵进行 scale ，使得其对角线元素矩阵和非对角线元素均值与 \(\mathbf{A}_{22}\) 矩阵相同。

然后我们计算 APY 逆，公式如下 \[ \begin{aligned} & \mathbf{G}_{\mathrm{APY}}^{-1}= {\left[\begin{array}{cc} \mathbf{G}_{c c}^{-1}+\mathbf{G}_{c c}^{-1} \mathbf{G}_{c n} \mathbf{M}_{n n}^{-1} \mathbf{G}_{c n}^{\prime} \mathbf{G}_{c c}^{-1} & -\mathbf{G}_{c c}^{-1} \mathbf{G}_{c n} \mathbf{M}_{n n}^{-1} \\ -\mathbf{M}_{n n}^{-1} \mathbf{G}_{c n}^{\prime} \mathbf{G}_{c c}^{-1} & \mathbf{M}_{n n}^{-1} \end{array}\right] } \\ &= {\left[\begin{array}{cc} \mathbf{G}_{\mathrm{APY}}^{c c} & \mathbf{G}_{\mathrm{APY}}^{c n} \\ \mathbf{G}_{\mathrm{APY}}^{c n} & \mathbf{M}_{n n}^{-1} \end{array}\right] \text { and } } \\ & \mathbf{M}_{n n}=\operatorname{diag}\left\{g_{i i}-\mathbf{g}_{c i}^{\prime} \mathbf{G}_{c c}^{-1} \mathbf{g}_{c i}\right\}, \end{aligned} \] 这里\(g_{ii}\) 是 \(\mathbf{G}_{nn}\) 的第 \(i\) 个对角线元素， \(\mathbf{g}_{ci}\) 是 \(\mathbf{G}_{cn}\) 的第 \(i\) 列，\(\mathbf{M}_{nn}\) 是一个对角线矩阵。

下面是为了实现高效计算需要注意的地方：

基因型需要压缩为二进制格式（如plink的 bed 文件），以节约内存。
我们需要完整计算 \(\mathbf{G}_{cc}\) 和 \(\mathbf{G}_{cn}\)，但是 \(\mathbf{G}_{nn}\) 矩阵只需要计算对角线元素，保存为一个向量。
我们通过 LAPACK (the Linear Algebra Package) 包 (Aguilar, 2011) 对 \(\mathbf{G}_{cc}\) 矩阵进行更新覆盖得到 \(\mathbf{G}_{cc}^{-1}\) ，那么 \(- \mathbf{G}_{cc}^{-1}\mathbf{G}_{cn}\) 会临时存储在内存中，而 \(\mathbf{M}_{nn}^{-1}\) 中的对角线元素会替换 \(\mathbf{G}_{nn}\) 矩阵中的元素。最终 \(\mathbf{G}_{cc}\) 和 \(\mathbf{G}_{cn}\) 会覆盖为 \(\mathbf{G}^{cc}_{APY}\) 和 \(\mathbf{G}^{cn}_{APY}\) 。BLAS (the Basic Linear Algebra Subprograms) 用于密集矩阵操作(Aguilar, 2011) 。

ssGBLUP 同样需要 \(\mathbf{A}_{22}^{-1}\) ，这个矩阵是一个密集矩阵，并且不能保存在内存中。当我们使用 PCG 方法来求解 MME 时，每一轮迭代我们只需要计算 \(\mathbf{A}_{22}^{-1}\) 和一个向量 (称为 \(\mathbf{q}\) ) 的乘积，这个乘积可以通过下面的公式得到（Strandén and Mäntysaari (2014)）： \[ A_{22}^{-1}=A^{22}-\left(A^{12}\right)^{\prime}\left(A^{11}\right)^{-1}\left(A^{12}\right) \]

\[ A_{22}^{-1} q=A^{22} q-\left(A^{12}\right)^{\prime}\left[\left(A^{11}\right)^{-1}\left[A^{12} q\right]\right] \]

其中右手项的所有元素均是 \(\mathbf{A}^{-1}\) 矩阵的子矩阵，即 \[ \mathbf{A}=\left[\begin{array}{ll} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{array}\right] \text { and } \mathbf{A}^{-1}=\left[\begin{array}{cc} \mathbf{A}^{11} & \mathbf{A}^{12} \\ \mathbf{A}^{21} & \mathbf{A}^{22} \end{array}\right] \] 每一次计算 \(\mathbf{A}_{22}^{-1} \mathbf{q}\) 均通过一系列系数矩阵操作得到：\(\mathbf{t = A^{22}q}\), \(\mathbf{x = A^{12}q}\), \(\mathbf{y = (A^{11})^{−1} x}\) , \(\mathbf{z = (A^{12})^{\prime}y}\), \(\mathbf{A_{22}^{−1}q = t − z}\) 。其中 \(\mathbf{t,x,y,z}\) 均为临时向量。

其中 \(\mathbf{y = (A^{11})^{−1} x}\) 可以改为方程组 \(\mathbf{A^{11}y = x}\) ，因此我们需要对 \(\mathbf{A}^{11}\) 矩阵进行 Cholesky 分解，我们可以使用 FSPAK 或 YAMS 来实现这一点。在实际中，因为结果相同，上面的表达式只考虑基因型个体（在矩阵 \(\mathbf{A}_{22}\) 中）和基因型个体的未检测基因型的祖先 （在矩阵 \(\mathbf{A}_{11}\) 中）（很好理解，就相当于对基因型个体追系谱了）。

我们将上面提到的算法应用到 BLUP90IOD2 程序中，其中通过 PCG 方法求解 MME 。这个脚本通过 Intel Fortran Compiler 14.0 进行编译。我们使用 the Intel Math Kernel Library 11.0 中的 BLAS 和 LAPACK 包。

参考文献

Masuda Y, Misztal I, Tsuruta S, et al. Implementation of genomic recursions in single-step genomic best linear unbiased predictor for US Holsteins with a large number of genotyped animals[J]. Journal of Dairy Science, 2016, 99(3): 1968-1974.
Aguilar, I., I. Misztal, A. Legarra, and S. Tsuruta. 2011. Efficient computation of the genomic relationship matrix and other matrices used in single-step evaluation. J. Anim. Breed. Genet. 128:422–428.

版权声明： 本博客所有文章除特别声明外，著作权归作者所有。转载请注明出处！