My conclusions

Zotero can do everything Mendeley could, and more elegantly. Simply do not use Mendeley.

Sadly, development of Docear has stopped, so don't use it.

Use tags to organize papers (see below for the reasons). Leave a collection as just that: literally a collection of items.

Set up your own rules ahead of time and then follow them strictly. Keep polishing your rules as needed.

Workflow

Collecting efficiently

Add a new collection, then search for and add papers using the Zotero Connector plugin for Chrome.

Download papers using the sci-hub-fy plugin, if necessary.

Make sure the metadata are correct at the very beginning.

Files should be managed by ZotFile and saved in Dropbox.

Reading on tablet

ZotFile + Dropbox + Book Note+ (or another tablet app).
This helps to concentrate on papers.

Read more »

Problem descriptions and definitions

Conventional or practical methods

Covariance Based Track Association (CBTA)

Geometrical approach

Xiangxu Lei, Kunpeng Wang, Pin Zhang, Teng Pan, Huaifeng Li, Jizhang Sang, and Donglei He, “A geometrical approach to association of space-based very short-arc LEO tracks”, Advances in Space Research, vol. 62, Aug. 2018, pp. 542–553.

space-based angles-only very short-arc (SBVSA) LEO tracks

Michalis K. Titsias, Magnus Rattray, and Neil D. Lawrence, “Markov chain Monte Carlo algorithms for Gaussian processes,” Bayesian Time Series Models, David Barber, A. Taylan Cemgil, and Silvia Chiappa, eds., Cambridge: Cambridge University Press, 2011, pp. 295–316. [Link].

Estimate latent function



$$y_i = f_i + \epsilon_i$$

The joint distribution is

$$p(\bm{y},\bm{f}) = p(\bm{y}|\bm{f})\,p(\bm{f})$$

Applying Bayes' rule, the posterior over $\bm{f}$ is

$$p(\bm{f}|\bm{y}) = \frac{p(\bm{y}|\bm{f})\,p(\bm{f})}{\int p(\bm{y}|\bm{f})\,p(\bm{f})\,\mathrm{d}\bm{f}}$$

To predict the function values $\bm{f}_*$ at unseen inputs $\bm{X}_*$:

$$\textcolor{blue}{p(\bm{f}_*|\bm{y})} = \int p(\bm{f}_*|\bm{f})\,p(\bm{f}|\bm{y})\,\mathrm{d}\bm{f}$$

where $p(\bm{f}_*|\bm{f})$ is the conditional GP prior given by

$$p(\bm{f}_*|\bm{f}) = \mathcal{N}(\bm{f}_*|\circ,\circ)$$

The prediction of $\bm{y}_*$ corresponding to $\bm{f}_*$ is

$$\textcolor{red}{p(\bm{y}_*|\bm{y})} = \int p(\bm{y}_*|\bm{f}_*)\,\textcolor{blue}{p(\bm{f}_*|\bm{y})}\,\mathrm{d}\bm{f}_*$$
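Under a Gaussian likelihood the blue predictive distribution above has a closed form. A minimal NumPy sketch (my own toy code, not from Titsias et al.; the RBF kernel, its hyperparameters, and the noise level are arbitrary choices) of computing the mean and covariance of $p(\bm{f}_*|\bm{y})$:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two 1-D input sets."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_predict(X, y, X_star, noise=0.1):
    """Mean and covariance of p(f_*|y) for noisy targets y."""
    K = rbf_kernel(X, X) + noise ** 2 * np.eye(len(X))
    K_s = rbf_kernel(X_star, X)
    K_ss = rbf_kernel(X_star, X_star)
    mean = K_s @ np.linalg.solve(K, y)          # posterior mean of f_*
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)  # posterior covariance
    return mean, cov

# Toy data: noisy samples of sin(x) on [0, 5]
X = np.linspace(0.0, 5.0, 20)
y = np.sin(X) + 0.1 * np.random.default_rng(0).normal(size=20)
mean, cov = gp_predict(X, y, np.array([2.5]))
```

With dense training data, the predicted mean at an interior point like $x_*=2.5$ lands close to the underlying $\sin(x_*)$.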

In a mainstream machine learning application involving large datasets and where fast inference is required, deterministic methods are usually preferred simply because they are faster.
In contrast, in applications related to scientific questions that need to be carefully addressed by carrying out a statistical data analysis, MCMC is preferred.

Rasmussen, Carl Edward, and Christopher K. I. Williams. 2006. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. Cambridge, Mass: MIT Press. http://www.gaussianprocess.org/gpml/chapters/.


Sec. 2 covers almost all the basic theory for doing regression.

Sec. 3 covers classification; I have not read it.

Sec. 4 covers the properties of different covariance functions; not read yet, to be read.

Sec. 5 covers the theory of training the model.
This book calls what machine learning usually calls "training" by the name "model selection", so as an outsider it took me a long time to figure out that this part is about how to train the model.

Bayesian inference

Read more »

Mark E Pittelkau, “Survey of Calibration Algorithms for Spacecraft Attitude Sensors and Gyros”, Advances in the Astronautical Sciences, vol. 129, 2007, pp. 1–55.

1. Introduction

The purpose of this paper is to present an overview of the various calibration algorithms, to examine their merits, and to show where and how they have been applied.

This survey extends back to 1969, although there were some relatively minor developments before that time.

This survey focuses mainly on methods rather than applications.

A critical review of the literature is provided, including strengths and weaknesses of algorithms and an assessment of results and conclusions in the literature.

Read more »

Jonathan Ko, “Gaussian Process for Dynamic Systems”, PhD Thesis, University of Washington, 2011.

The Bayes filter equation in Eq. 4.1 (p. 34) has a typo: it should be $\propto$, not $=$.

$$p(x_t|z_{1:t},u_{1:t-1}) \propto \textcolor{green}{p(z_t|x_t)} \int \textcolor{red}{p(x_t|x_{t-1},u_{t-1})}\,p(x_{t-1}|z_{1:t-1},u_{1:t-2})\,\mathrm{d}x_{t-1}$$

  • The $\textcolor{red}{\text{red}}$ part is the dynamics model, describing how the state $x$ evolves in time based on the control input $u$ (p. 34)
  • The $\textcolor{green}{\text{green}}$ part is the observation model, describing the likelihood of making an observation $z$ given the state $x$
  • GP-BayesFilter improves these two parts.
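The recursion above can be illustrated with a hypothetical one-dimensional grid filter (my own sketch, not code from the thesis): the prediction step convolves the belief with the dynamics model, and the update step multiplies by the observation likelihood and renormalizes.

```python
import numpy as np

def predict(belief, motion_kernel):
    """Dynamics step: convolve the belief with the motion model
    (the integral over x_{t-1} in the Bayes filter equation)."""
    return np.convolve(belief, motion_kernel, mode='same')

def update(belief, likelihood):
    """Observation step: multiply by p(z_t|x_t) and renormalize."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

belief = np.full(10, 0.1)            # uniform prior over 10 grid cells
motion = np.array([0.1, 0.8, 0.1])   # mostly stay put, small diffusion
likelihood = np.zeros(10)
likelihood[3] = 1.0                  # a (contrived) sensor: state is cell 3

belief = update(predict(belief, motion), likelihood)
```

With this deliberately sharp likelihood the posterior collapses onto cell 3; a GP-BayesFilter replaces the hand-written `motion` and `likelihood` factors with learned GP models.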

The dynamics model maps the state and control $(x_t,u_t)$ to the state transition $\Delta x_t = x_{t+1} - x_t$.
So, the training data is

$$D_p = \langle (X,U), X' \rangle$$

The observation model maps from the state $x_t$ to the observation $z_t$.
So, the training data is

$$D_o = \langle X, Z \rangle$$

The resulting GP dynamics and observation models are (p.44)

$$p(x_t|x_{t-1},u_{t-1}) \approx \mathcal{N}\big(\text{GP}_\mu([x_{t-1},u_{t-1}],D_p),\,\text{GP}_\Sigma([x_{t-1},u_{t-1}],D_p)\big)$$


$$p(z_t|x_t) \approx \mathcal{N}\big(\text{GP}_\mu(x_t,D_o),\,\text{GP}_\Sigma(x_t,D_o)\big)$$
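Assembling the two training sets $D_p$ and $D_o$ from a recorded trajectory is mechanical. A small NumPy sketch (invented toy data, not Ko's code) under the assumption of a 2-D state and 1-D control:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
X = np.cumsum(rng.normal(size=(T, 2)), axis=0)   # states x_1 .. x_T (random walk)
U = rng.normal(size=(T - 1, 1))                  # controls u_1 .. u_{T-1}
Z = X + rng.normal(scale=0.05, size=X.shape)     # noisy observations of the states

# Dynamics training set D_p = <(X, U), X'>: inputs are (x_t, u_t),
# targets are the state transitions Delta x_t = x_{t+1} - x_t.
inputs_p = np.hstack([X[:-1], U])
targets_p = X[1:] - X[:-1]

# Observation training set D_o = <X, Z>: inputs are states,
# targets are the corresponding observations.
inputs_o, targets_o = X, Z
```

Each of `(inputs_p, targets_p)` and `(inputs_o, targets_o)` can then be fed to any GP regression routine to produce the $\text{GP}_\mu$ and $\text{GP}_\Sigma$ terms above.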

Read more »

If you are looking at this post, it means you are, like me as of 2020-07-29, pretty much a newbie to TensorFlow.


Keras is already part of TensorFlow, so use `from tensorflow.keras import ...`, not `from keras import ...`.

TensorFlow backend

Early stopping


```python
from tensorflow.keras.callbacks import EarlyStopping

model.fit(..., callbacks=[EarlyStopping(monitor='val_loss', patience=5,
                                        verbose=1, mode='min',
                                        restore_best_weights=True)], ...)
```

Reproducibility of results

Set all random seeds
Use tensorflow.keras instead of standalone keras
Use model.predict_on_batch(x).numpy() for faster prediction.
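"Set all random seeds" means seeding every random source the program touches, not just TensorFlow's. One way to do this (the value 42 is an arbitrary choice; bit-exact reproducibility can additionally depend on GPU ops and thread counts):

```python
import os
import random

import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary; any fixed value works

os.environ['PYTHONHASHSEED'] = str(SEED)  # Python hash randomization
random.seed(SEED)                         # Python's stdlib RNG
np.random.seed(SEED)                      # NumPy's legacy global RNG
tf.random.set_seed(SEED)                  # TensorFlow's global seed
```

Run this once at the top of the script, before any model or data pipeline is built.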

Read more »