# TODO

PUT THE FOLLOWING NOTES INTO CORRECT POSTS:

• The “evidence framework” and the “expectation maximization (EM) algorithm” solve the same problem: maximize the marginal (over $\bm{w}$) likelihood function $p(\bm{t}|\bm{X},\alpha,\beta)$. (Bishop, 2006, p. 166)

# OneWeb

~648 satellites

as early as 2019

proposed by the company WorldVu Satellites Ltd.

The 648 communication satellites will operate in circular low Earth orbit, at approximately 750 miles (1,200 km) altitude, transmitting and receiving in the Ku band of the radio frequency spectrum.

## Progress

1. Most of the capacity of the initial 648 satellites has been sold, and OneWeb is considering nearly quadrupling the size of the satellite constellation by adding 1,972 additional satellites that it has priority rights to.

# Different frames

EME2000 == J2000 (https://en.wikipedia.org/wiki/Earth-centered_inertial)

ICRF == J2000 (for JPL)

Geocentric Celestial Reference Frame (GCRF) is the Earth-centered counterpart of the International Celestial Reference Frame. (https://en.wikipedia.org/wiki/Earth-centered_inertial)

TEME: True Equator, Mean Equinox (the frame in which TLE/SGP4 elements are expressed):

http://itrf.ensg.ign.fr/

## geocentric system, rotational, ECEF

ECEF: Earth-Centered, Earth-Fixed

ECR: Earth-Centered Rotational

• == ECEF

ITRS: International Terrestrial Reference System

• Realizations of the ITRS are produced by the IERS ITRS Product Center (ITRS-PC) under the name International Terrestrial Reference Frames (ITRF).

ITRF: International Terrestrial Reference Frame

WGS84: World Geodetic System 1984

• a standard for use in cartography, geodesy, and satellite navigation, including GPS (Wikipedia)
• it comprises (Wikipedia):
  • a standard coordinate system for the Earth,
  • a standard spheroidal reference surface (the datum or reference ellipsoid) for raw altitude data,
  • a gravitational equipotential surface (the geoid) that defines the nominal sea level.

BCRS: barycentric celestial reference system (https://en.wikipedia.org/wiki/Barycentric_celestial_reference_system)

• The focus of the BCRS is on astronomy: exploration of the Solar System and the universe.

## geocentric system, inertial

IERS: International Earth Rotation and Reference Systems Service.

• maintains the ITRS and its ITRF realizations

GCRS: Geocentric Celestial Reference System (https://en.wikipedia.org/wiki/Barycentric_celestial_reference_system)

• BCRS centered at Earth
• The focus of the GCRS is somewhat more on the navigation of Earth satellites and the geophysical applications they support. The proper functioning of the Global Positioning System (GPS) is directly dependent upon the accuracy of satellite measurements as supported by the GCRS.

GCRF: Geocentric Celestial Reference Frame

## centered at the barycenter of the Solar System

ICRS: International Celestial Reference System

ICRF: International Celestial Reference Frame

## others

Most of the following is excerpted from the Orekit Frames documentation (https://www.orekit.org/site-orekit-9.2/architecture/frames.html)

MOD: Mean Of Date frame = J2000 + precession evolution

TOD: True Of Date frame = J2000 + precession evolution + nutation

GTOD: Greenwich True Of Date frame = J2000 + precession evolution + nutation + Earth rotation (Greenwich Apparent Sidereal Time)

GCRF --> EME2000

• rotations about three axes, through very small angles (from the Orekit source code, EME2000Provider())

## software

There are two software libraries of IAU-sanctioned algorithms for manipulating and transforming among the BCRS and other reference systems:

• the Standards of Fundamental Astronomy (SOFA) system
• the Naval Observatory Vector Astrometry Subroutines (NOVAS).


A terrestrial reference frame provides a set of coordinates for points located on the Earth’s surface.

GCRS = GCRF = ICRF = ICRS: only a small difference from EME2000. So results obtained from Skyfield should not need to be converted to EME2000 and can be used directly in GCRF. One could test in Orekit how much these two frames differ. https://space.stackexchange.com/questions/26259/what-is-the-difference-between-gcrs-and-j2000-frames

EOP: Earth Orientation Parameters
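
To first order, the inertial-to-rotating relationship above is just a rotation about the z-axis by the Earth rotation angle. A minimal numpy sketch (illustrative only: a real GCRF → ITRF transform also applies precession, nutation, polar motion, and the EOP corrections):

```python
import numpy as np

def eci_to_ecef(r_eci, gmst_rad):
    """Rotate an ECI position into ECEF by the Greenwich sidereal angle.
    First-order sketch only; ignores precession, nutation, polar motion."""
    c, s = np.cos(gmst_rad), np.sin(gmst_rad)
    R = np.array([[  c,   s, 0.0],
                  [ -s,   c, 0.0],
                  [0.0, 0.0, 1.0]])
    return R @ r_eci

r = np.array([7000.0, 0.0, 0.0])   # km, on the inertial x-axis
print(eci_to_ecef(r, np.pi / 2))   # ~[0, -7000, 0]: Earth has turned 90 deg
```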

# (20130903) EOS Aura and Shijian (SJ)-11-02 satellite conjunction (saved web ppt)

On September 3, 2013, a close approach was predicted.

Second in a series of SJ-11 satellites launched by China into an orbit very similar to that of the Morning and Afternoon Constellations.

Aura flight controllers prepared a risk mitigation maneuver (RMM) to avoid the close approach. They did not know whether SJ-11-02 was capable of maneuvering.

A request was sent through the US State Department to its Chinese counterpart to let their space agency know of NASA’s planned maneuver, but there was no direct two-way coordination.

Both satellites maneuvered within hours of each other. Fortunately, the two maneuvers mitigated the risk.

An example of the need to improve communication with non-constellation satellites.

# (20140405) Sentinel 1A - ACRIMSAT close approach immediately after launch

# almosallam_heteroscedastic_2017

Heteroscedastic Gaussian processes for uncertain and incomplete data

Ibrahim Almosallam

PhD Thesis, University of Oxford, https://ora.ox.ac.uk/objects/uuid:6a3b600d-5759-456a-b785-5f89cf4ede6d

Estimation theory

# Mathematical concepts

$\mathcal{L}(\theta \mid x) = f(x\mid\theta)$

When $f(x|\theta)$ is viewed as a function of $x$, it is the probability density function; when $f(x|\theta)$ is viewed as a function of $\theta$, it is the likelihood function.

$f(\theta \mid x) = \frac{f(x\mid \theta )\,g(\theta )}{\int _{\Theta }f(x\mid \vartheta )\,g(\vartheta )\,d\vartheta}$

$\hat{\theta}_{\rm ML}(x) = \arg \max_{\theta} \mathcal{L}(\theta|x) = \arg \max_{\theta} f(x|\theta)$

$\hat{\theta}_{\rm MAP}(x) = \arg \max_{\theta}{\frac{f(x\mid\theta )\,g(\theta )}{\int_{\Theta}f(x\mid\theta ')\,g(\theta ')\,d\theta '}}=\arg \max_{\theta}f(x|\theta )\,g(\theta)$

$g(\theta)$ is the assumed prior; the denominator does not depend on $\theta$; when $g(\theta)$ is uniform, $\hat\theta_{\rm MAP}$ is equivalent to $\hat\theta_{\rm ML}$.
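
A quick numeric check of the ML/MAP relationship above, using a Bernoulli likelihood with a Beta(a, b) prior (the numbers are illustrative):

```python
# Data: n Bernoulli trials with k successes
n, k = 20, 14

# ML estimate: maximize f(x|theta)  ->  k/n
theta_ml = k / n

# MAP with a Beta(a, b) prior g(theta): the posterior is
# Beta(k + a, n - k + b), and its mode is the MAP estimate.
def theta_map(a, b):
    return (k + a - 1) / (n + a + b - 2)

print(theta_ml)          # 0.7
print(theta_map(1, 1))   # 0.7 -- uniform prior Beta(1, 1) recovers the ML estimate
print(theta_map(2, 2))   # informative prior pulls the estimate toward 0.5
```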

# Algorithm

(unfinished)

$-\frac{1}{2} \bm{\beta} \sigma^2 - \frac{1}{2} \bm{B} \left( \Phi \circ (\Phi\Sigma^{-1}) \right) \bm{1}_m + \frac{1}{2} \cdot \bm{1}_m$

## Unread

Moment matching algorithm

# Study materials

Jorbe, Variational methods and variational Bayesian inference (the end of the article feels incomplete: why do we want to obtain $P(X,Z)$? And once we have it, how is it used; what is the next step?)

Gal, Yarin, “Uncertainty in Deep Learning,” PhD thesis, University of Cambridge, 2016.

# Main contributions of the thesis

(p15) We will thus concentrate on the development of practical techniques to obtain model confidence in deep learning, techniques which are also well rooted within the theoretical foundations of probability theory and Bayesian modelling. Specifically, we will make use of stochastic regularisation techniques (SRTs).

These techniques adapt the model output stochastically as a way of model regularisation (hence the name stochastic regularisation). This results in the loss becoming a random quantity, which is optimised using tools from the stochastic non-convex optimisation literature. Popular SRTs include dropout [Hinton et al., 2012], multiplicative Gaussian noise [Srivastava et al., 2014], dropConnect [Wan et al., 2013], and countless other recent techniques.
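
A toy numpy illustration (my own sketch, not from the thesis) of dropout as multiplicative noise: masking activations makes each forward pass, and hence the loss, a random quantity:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W, p_drop=0.5, train=True):
    """One dense layer with dropout applied to its activations."""
    h = np.maximum(W @ x, 0.0)                   # ReLU units
    if train:
        mask = rng.random(h.shape) >= p_drop     # Bernoulli dropout mask
        h = h * mask / (1.0 - p_drop)            # inverted-dropout scaling
    return h.sum()

x = np.ones(4)
W = rng.standard_normal((8, 4))
losses = [forward(x, W) for _ in range(5)]
print(losses)                       # different values: the loss is stochastic
print(forward(x, W, train=False))   # deterministic at test time
```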

# The author's discussion of NNs

## CNN

Convolutional neural networks (CNNs). CNNs [LeCun et al., 1989; Rumelhart et al., 1985] are popular deep learning tools for image processing, which can solve tasks that until recently were considered to lie beyond our reach [Krizhevsky et al., 2012; Szegedy et al., 2014]. The model is made of a recursive application of convolution and pooling layers, followed by inner product layers at the end of the network (simple NNs as described above). A convolution layer is a linear transformation that preserves spatial information in the input image (depicted in figure 1.1). Pooling layers simply take the output of a convolution layer and reduce its dimensionality (by taking the maximum of each (2, 2) block of pixels for example). The convolution layer will be explained in more detail in section §3.4.1.
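
The (2, 2) max pooling mentioned above can be sketched in a few lines of numpy (assumes the image dimensions are divisible by 2):

```python
import numpy as np

def max_pool_2x2(img):
    """Reduce each non-overlapping (2, 2) block of pixels to its maximum."""
    h, w = img.shape
    blocks = img.reshape(h // 2, 2, w // 2, 2)  # [i, a, j, b] = img[2i+a, 2j+b]
    return blocks.max(axis=(1, 3))

img = np.array([[1, 2, 5, 6],
                [3, 4, 7, 8],
                [9, 1, 2, 3],
                [5, 6, 4, 0]])
print(max_pool_2x2(img))
# [[4 8]
#  [9 4]]
```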

## RNN

Recurrent neural networks (RNNs). RNNs [Rumelhart et al., 1985; Werbos, 1988] are sequence-based models of key importance for natural language understanding, language generation, video processing, and many other tasks [Kalchbrenner and Blunsom, 2013; Mikolov et al., 2010; Sundermeyer et al., 2012; Sutskever et al., 2014].

## PILCO

PILCO [Deisenroth and Rasmussen, 2011], for example, is a data-efficient probabilistic model-based policy search algorithm. PILCO analytically propagates uncertain state distributions through a Gaussian process dynamics model. This is done by recursively feeding the output state distribution (output uncertainty) of one time step as the input state distribution (input uncertainty) of the next time step, until a fixed time horizon T.

## Relationship to GPs

(p14) Even though modern deep learning models used in practice do not capture model confidence, they are closely related to a family of probabilistic models which induce probability distributions over functions: the Gaussian process. Given a neural network, by placing a probability distribution over each weight (a standard normal distribution for example), a Gaussian process can be recovered in the limit of infinitely many weights (see Neal [1995] or Williams [1997]). For a finite number of weights, model uncertainty can still be obtained by placing distributions over the weights—these models are called Bayesian neural networks.
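
A quick numeric illustration of the infinite-width intuition (my own sketch, not Neal's derivation): with i.i.d. standard-normal weights and $1/\sqrt{H}$ output scaling, the output at a fixed input is a sum of many i.i.d. terms, so its distribution over weight draws approaches a Gaussian as the width $H$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_net_output(x, width):
    """Single-hidden-layer net with standard-normal weights, NN -> GP scaling."""
    w = rng.standard_normal(width)   # input-to-hidden weights
    b = rng.standard_normal(width)   # hidden biases
    v = rng.standard_normal(width)   # hidden-to-output weights
    return v @ np.tanh(w * x + b) / np.sqrt(width)

# Output distribution over random draws of the weights, at a fixed input
samples = np.array([random_net_output(0.5, width=1000) for _ in range(2000)])
print(samples.mean(), samples.std())  # mean near 0; the spread settles as H grows
```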

# Monte Carlo integration

Assuming $\theta_i$ is sampled from the distribution $p(\theta|D)$, the Monte Carlo integration formula is:

$\mathbb{E}_{\theta\sim p(\theta|D)}[g(\theta)] = \int g(\theta)\, p(\theta|D)\, d\theta \approx \frac{1}{n} \sum_{i=1}^{n} g(\theta_i), \qquad \theta_i \sim p(\theta|D)$

The approximation error decreases as $O(1/\sqrt{n})$.
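
A minimal sketch of the formula, taking $p(\theta|D) = \mathcal{N}(0, 1)$ and $g(\theta) = \theta^2$ so the exact expectation is 1 (both choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_expectation(g, sampler, n):
    """Estimate E[g(theta)] by averaging g over n samples from p(theta|D)."""
    return g(sampler(n)).mean()

# p(theta|D) = N(0, 1), g(theta) = theta^2  =>  E[g] = Var[theta] = 1
est = mc_expectation(lambda t: t**2, lambda n: rng.standard_normal(n), 100_000)
print(est)  # close to 1; the error shrinks like O(1/sqrt(n))
```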

The following discussion provides a very clear interpretation of Bayesian inference, but I’m not sure it is entirely correct; more reading needed.
Can a posterior expectation be used as an approximation to the true (prior) expectation?

# Study materials

[ref-2] daniel-D, From stochastic processes to Markov chain Monte Carlo methods (not great; the exposition is rather muddled)

[ref-3] 靳志辉, LDA-math: MCMC and Gibbs Sampling (this is where I started working through the algorithms carefully; detailed balance condition)

[ref-4] shenxiaolu1984, An introduction to Monte Carlo Markov chains (MCMC) (briefly introduces four sampling methods, but the formulas for the specific algorithms are broken)

[ref-7] Stochastic simulation: Monte Carlo integration and sampling (a detailed account of direct sampling, rejection sampling, and importance sampling) (the explanations of Monte Carlo integration and the common sampling methods are intuitive and insightful; one of the main uses of MCMC is to support Monte Carlo integration, which involves sampling from some probability density $f(x)$)

[ref-8] Bin’s blog, A survey and explanation of random sampling methods (MCMC, Gibbs sampling, etc.) (recommended; essentially the most correct order in which to understand the material)

[ref-9] More on MCMC methods

[ref-wiki-MCMC] Markov chain Monte Carlo (not yet read)

[ref-wiki-Gibbs]

# My summary

• stochastic processes
• the Markov property (no after-effect / memorylessness)
• limiting and stationary distributions of Markov chains
• sampling from probability distributions; numerical methods
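
The items above come together in the Metropolis-Hastings algorithm; a minimal random-walk sketch targeting $\mathcal{N}(0, 1)$ (step size and chain length are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis(log_target, x0, n_steps, step=1.0):
    """Random-walk Metropolis: the acceptance rule enforces detailed balance,
    so the chain's stationary distribution is the target."""
    x, samples = x0, []
    lp = log_target(x)
    for _ in range(n_steps):
        prop = x + step * rng.standard_normal()
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:  # accept with prob min(1, ratio)
            x, lp = prop, lp_prop
        samples.append(x)
    return np.array(samples)

# Target: standard normal (an unnormalized log-density is enough)
chain = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_steps=50_000)
print(chain.mean(), chain.var())  # roughly 0 and 1
```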

# steinwart_support_2008

## The essence of Statistical Learning Theory

• Assuming that the output value y for a given x is stochastically generated by P( · |x) accommodates the fact that, in general, the information contained in x may not be sufficient to determine a single response in a deterministic manner.
• Assuming that the conditional probability P( · |x) is unknown reflects the assumption that we do not have a reasonable description of the relationship between the input and output values.

## SVM 和 GP 的关系

For a brief description of kernel ridge regression and Gaussian processes, see Cristianini and Shawe-Taylor (2000, Section 6.2).

We refer to Wahba (1999) for the relationship between SVMs and Gaussian processes.