My conclusion:

Simply put: do not use Mendeley.
Zotero can do everything Mendeley could, even more elegantly.

Sadly, development of Docear has stopped.


Learning Materials

  • A one-article overview of the AutoEncoder model evolution map

    • From "深度推荐系统" (Deep Recommender Systems)
    • AutoEncoder; Denoising AutoEncoder; Sparse AutoEncoder; CNN/LSTM AutoEncoder
    • Bringing generative modeling into the AutoEncoder --> Variational AutoEncoder (VAE is a milestone result, not because it is an especially strong generative model, but because it provides a way of combining probabilistic-graphical ideas to strengthen model robustness. Many later extensions build on VAE, including infoVAE, betaVAE, and factorVAE.)
    • Bringing the GAN idea into the AutoEncoder --> Adversarial AutoEncoder
  • Autoencoders: all kinds of autoencoders

    • https://blog.keras.io/building-autoencoders-in-keras.html
    • In practice, autoencoders usually perform poorly as data-compression tools.
    • Autoencoders see little use in real applications; around 2012 people found that layer-wise pretraining with autoencoders could be used to train deep convolutional networks
    • By late 2015, residual learning (ResNet) made it possible to train neural networks of essentially arbitrary depth. (Look up this ResNet-based training method.)
    • An autoencoder is not truly an unsupervised learning algorithm; it is a self-supervised one
    • If your input is a sequence rather than a 2D image, you may want to build the autoencoder from a sequence model such as an LSTM. To build an LSTM autoencoder, first use an LSTM encoder to turn the input sequence into a single vector, repeat that vector N times, and then use an LSTM decoder to turn this N-step sequence into the target sequence.

    • Self-supervised learning is an instance of supervised learning in which the labels are produced from the input data. To get a self-supervised model, you need to come up with a sensible objective and a loss function; the catch is that simply setting the target to be a reconstruction of the input may not be the right choice (why?)

    • Basically, demanding that the model reconstruct the input exactly at the pixel level is not what machine learning cares about; learning high-level abstract features is.
    • In fact, when your main task is classification, localization, or the like, the features that are best for those tasks tend to be the worst features for reconstructing the input.
  • MATLAB: Train Stacked Autoencoders for Image Classification

    • This example showed how to train a stacked neural network to classify digits in images using autoencoders.
  • Unsupervised anomaly detection with an Autoencoder

    • Use the mean absolute error (MAE) of the reconstruction error directly as the anomaly criterion, without attaching a classification network
    • (referenced from here)
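The anomaly-detection idea above, scoring samples by reconstruction MAE with no classification head, can be sketched with a linear autoencoder. A tied-weight linear autoencoder with squared loss learns the top principal subspace, so this sketch fits it in closed form via SVD; the data, latent size, and threshold rule are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# "Normal" data lies near a 1-D subspace of R^3; anomalies do not.
normal = rng.normal(size=(200, 1)) @ np.array([[1.0, 2.0, -1.0]])
normal += 0.05 * rng.normal(size=normal.shape)

# A tied-weight linear autoencoder with squared loss learns the top
# principal subspace, so we fit it in closed form via SVD.
mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
W = Vt[:1]  # 1-D latent code

def reconstruction_mae(x):
    z = (x - mean) @ W.T   # encode
    x_hat = z @ W + mean   # decode
    return np.abs(x - x_hat).mean(axis=-1)

# Illustrative threshold: twice the worst MAE seen on normal data.
threshold = reconstruction_mae(normal).max() * 2
anomaly = np.array([5.0, -5.0, 5.0])   # far off the learned subspace
print(reconstruction_mae(anomaly) > threshold)  # → True
```

The point is that no classifier is trained: a sample is flagged purely because the autoencoder, fit on normal data only, fails to reconstruct it.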

VAE

https://zhuanlan.zhihu.com/p/34998569

  • In essence, VAE takes an ordinary autoencoder and adds "Gaussian noise" to the encoder output (in a VAE this corresponds to the network that computes the mean), so that the decoder becomes robust to noise; the extra KL loss (whose goal is mean 0 and variance 1) effectively acts as a regularizer on the encoder, encouraging the encoder outputs to have zero mean.
  • Put plainly, the reconstruction term wants no noise, while the KL loss wants Gaussian noise; the two are in opposition. So, like a GAN, a VAE internally contains an adversarial process, except that in a VAE the two parts are blended together and evolve jointly.
  • In one sentence: the "variational" in VAE's name is there because its derivation uses the KL divergence and its properties.
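A minimal numerical sketch of the KL term described above (function names are mine, not from the post): the closed-form KL from N(mu, sigma^2) to N(0, 1) is zero exactly when the encoder outputs mean 0 and variance 1, and grows as the code drifts away.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Zero exactly at mean 0, variance 1 -- the target the KL loss pulls toward.
print(kl_to_standard_normal(np.zeros(4), np.zeros(4)))          # → 0.0
# Grows as the encoder output drifts away, acting as a regularizer.
print(kl_to_standard_normal(np.array([2.0]), np.array([0.0])))  # → 2.0

# The "Gaussian noise" enters via the reparameterization z = mu + sigma * eps,
# which is what forces the decoder to be robust to noisy codes.
rng = np.random.default_rng(0)
mu, log_var = np.zeros(4), np.zeros(4)
z = mu + np.exp(0.5 * log_var) * rng.normal(size=4)
```

This makes the opposition concrete: the reconstruction loss is minimized by shrinking sigma toward 0 (no noise), while the KL term is minimized at sigma = 1 (full noise).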

Mark E Pittelkau, “Survey of Calibration Algorithms for Spacecraft Attitude Sensors and Gyros”, Advances in the Astronautical Sciences, vol. 129, 2007, pp. 1–55.

1. Introduction

The purpose of this paper is to present an overview of the various calibration algorithms, to examine their merits, and to show where and how they have been applied.

This survey extends back to 1969, although there were some relatively minor developments before that time.

This survey focuses mainly on methods rather than applications.

A critical review of the literature is provided, including strengths and weaknesses of algorithms and an assessment of results and conclusions in the literature.


In this post I summarize everything about GP. An individual post will be organized once I feel I have gathered enough materials, insights, and thoughts.

At the end, I attach all other individual posts about GP or that mention GP.


Jonathan Ko, “Gaussian Process for Dynamic Systems”, PhD Thesis, University of Washington, 2011.

The Bayes filter equation in Eq. 4.1 (p.34) has a typo (it should be \propto, not =)

p(x_t|z_{1:t},u_{1:t-1}) \propto p(z_t|x_t) \int \textcolor{red}{p(x_t|x_{t-1},u_{t-1})} \, \textcolor{green}{p(x_{t-1}|z_{1:t-1},u_{1:t-2})} \, dx_{t-1}

  • The \textcolor{red}{red} part is the dynamics model, describing how the state x evolves in time based on the control input u (p.34)
  • The \textcolor{green}{green} part is the observation model, describing the likelihood of making an observation z given the state x
  • GP-BayesFilter improves these two parts.
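The recursion above can be made concrete with a tiny discrete-state example, where the integral becomes a sum (the control input u is dropped for brevity, and all probabilities are made up for illustration):

```python
import numpy as np

# Discrete-state sketch of the Bayes filter update (Eq. 4.1):
# p(x_t | z_{1:t}) ∝ p(z_t | x_t) * sum_{x'} p(x_t | x') p(x' | z_{1:t-1})
prior = np.array([0.5, 0.5])          # p(x_{t-1} | z_{1:t-1})
transition = np.array([[0.9, 0.1],    # p(x_t | x_{t-1}); row = x_{t-1}
                       [0.2, 0.8]])   # (the "red" dynamics model)
likelihood = np.array([0.7, 0.1])     # p(z_t | x_t) for the observed z_t
                                      # (the "green" observation model)

predicted = transition.T @ prior      # the integral/sum over x_{t-1}
posterior = likelihood * predicted
posterior /= posterior.sum()          # normalize (this is why it is ∝, not =)
print(posterior.round(3))             # → [0.895 0.105]
```

GP-BayesFilter keeps exactly this recursion and only replaces the two hand-designed models (transition and likelihood) with Gaussian process models learned from data.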

The dynamics model maps the state and control (x_t, u_t) to the state transition \Delta x_t = x_{t+1} - x_t. So, the training data is

D_p = <(X,U), X'>

The observation model maps the state x_t to the observation z_t. So, the training data is

D_o = <X, Z>

The resulting GP dynamics and observation models are (p.44)

p(x_t|x_{t-1},u_{t-1}) \approx \mathcal{N}(\text{GP}_\mu([x_{t-1},u_{t-1}],D_p), \text{GP}_\Sigma([x_{t-1},u_{t-1}],D_p))

and

p(z_t|x_t) \approx \mathcal{N}(\text{GP}_\mu(x_t,D_o), \text{GP}_\Sigma(x_t,D_o))
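Both models reduce to standard GP regression on their training sets. A minimal sketch of how GP_mu and GP_Sigma are computed from training data <X, Y> (the RBF kernel, length scale, and toy data are illustrative choices, not from the thesis):

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

# Training data D = <X, Y>: inputs and (nearly) noise-free targets.
X = np.array([-1.0, 0.0, 1.0])
Y = np.sin(X)
noise = 1e-4

K = rbf(X, X) + noise * np.eye(len(X))   # kernel matrix on training inputs
x_star = np.array([0.5])                 # test input
k_star = rbf(x_star, X)

# GP posterior at x*: GP_mu(x*, D) and GP_Sigma(x*, D).
gp_mu = k_star @ np.linalg.solve(K, Y)
gp_var = rbf(x_star, x_star) - k_star @ np.linalg.solve(K, k_star.T)
```

The key property exploited by GP-BayesFilter is that the prediction comes with GP_Sigma, a state-dependent uncertainty, rather than a single point estimate.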