# Online Meta Learning - Slides of My Presentation @ NTU

These are the slides from my group-meeting presentation of the paper Online Meta-Learning [1].

Reference: Finn’s oral presentation at ICML 2019 [2].

## Reference

1. Chelsea Finn, Aravind Rajeswaran, Sham Kakade and Sergey Levine. Online Meta-Learning. ICML 2019.
2. Chelsea Finn, Aravind Rajeswaran, Sham Kakade and Sergey Levine. Online Meta-Learning Oral Presentation. ICML 2019.

# A Survey on Meta-Learning - Slides of My Presentation @ NTU

Two weeks after I joined NTU, I presented a survey at our group meeting. Here, I share the slides with everyone.

Reference: Lil’Log [1], Finn’s tutorial at ICML 2019 [2].

# ECCV 2018 Person Re-Identification Paper Reading

## Pose-Normalized Generation for Person Re-Identification

[1]

Overview: The authors use a GAN that takes pose information to transfer a real person image into (i.e., generate) a new image with a target pose. With the inputs normalized to the same pose, the feature extractor can learn pose-invariant features.

Motivation:

• Make the learned features pose-insensitive. Since the inputs are of the same pose, the feature extractor can learn discriminative features regardless of pose.
• Use eight canonical poses, obtained by K-means on the specific dataset, to make the normalized poses more representative.

Model:

• A GAN that generates a pose-specific image with the same identity as the input. The generator loss is
$$\mathcal{L}_{G_p} = \mathcal{L}_{GAN} + \lambda_1 \cdot \mathcal{L}_{L_1},$$
where
$$\mathcal{L}_{GAN} = \mathbb{E}_{\mathbf{I}_j \sim p_{data}( \mathbf{I}_j )}\left[ \log D_p(\mathbf{I}_j) + \log\left( 1 - D_p\big( G_p(\mathbf{I}_j, \mathbf{I}_{P_j}) \big) \right) \right]$$
$$\mathcal{L}_{L_1} = \mathbb{E}_{\mathbf{I}_j \sim p_{data}( \mathbf{I}_j )}\left[ \| \mathbf{I}_j - \mathbf{\hat I}_j \|_1 \right]$$
The discriminator loss is the same as in conventional GANs.
• Use the K-means algorithm to obtain the 8 most representative poses, then train another network to handle person images in these 8 normalized poses.
• At evaluation time, fuse nine features (one from the original image and eight from the pose-normalized images) by taking the element-wise maximum among them.
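The fusion step described above is just an element-wise maximum across the nine feature vectors. A minimal sketch (the feature dimension here is an arbitrary toy value, not the one used in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: one feature from the original image plus eight
# from the pose-normalized images, each d-dimensional.
d = 4
features = [rng.random(d) for _ in range(9)]

# Fusion: element-wise maximum across the nine feature vectors.
fused = np.maximum.reduce(features)

assert fused.shape == (d,)
# The fused vector dominates every individual feature element-wise.
assert all((fused >= f).all() for f in features)
```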

Question:

• There is no explicit loss for the discriminator to judge the generated pose. The discriminator only takes the generated image and tells whether it is fake or real (from the dataset), which means the generator can simply ignore the input pose information and learn like a regular GAN generator.
• Due to the $\mathcal{L}_{L_1}$ term in the generator loss, the generator can be optimized to simply reproduce the input image. As mentioned in the item above, the pose information can then be ignored.
• Since there are two base networks whose features are fused at the end, how do we guarantee that the two learned features have the same semantic meaning?
• More evaluations of the model’s components are needed, as well as more images generated by PN-GAN.

## Reference

1. X. Qian, Y. Fu, T. Xiang, W. Wang, J. Qiu, Y. Wu, Y. G. Jiang and X. Xue. Pose-Normalized Image Generation for Person Re-identification. ECCV 2018.

Adversarial Open-World Person Re-Identification is my first first-author paper, written during my undergraduate years. Its acceptance to ECCV 2018 makes it all the more exciting and meaningful to me.

Actually, it is this work that gave me a better understanding of real research: discussing it over and over again, running experiments over and over again, and polishing it over and over again until the paper submission deadline, which gave me the opportunity to see what Guangzhou looks like at 4:00 am. Prof. Zheng and PhD student Wu helped me a lot, and I really appreciate it.

The paper is available on arXiv.

## How did I get this idea?

In the open-world person re-identification (hereinafter re-id) scenario, the major issue is to distinguish target people from non-target ones. Obviously, non-target people who look like target people (i.e., are close to them in feature space) are a main threat to a re-id system. A simple solution is to find target-like non-target people in the dataset and alert the network by designing a loss that pushes them away. But you can never get enough such samples from a fixed dataset, and this is where Generative Adversarial Networks (GANs) [1] come in.

This work was initially inspired by Zheng’s paper Unlabeled Samples Generated by GAN Improve the Person Re-Identification Baseline in Vitro [2], when Prof. Zheng suggested that maybe we could do something with GANs in open-world person re-id. Zheng’s work was one of the few at that time applying GANs to person re-id. However, its use of the GAN is very limited. Leaving aside the loss design, in my understanding, the GAN in that work is only used to generate unlabeled samples to augment the dataset: they trained the GAN first, used it to produce more samples to form a larger dataset, and that is where the GAN’s job ends. The following processes are all about how to make use of the enlarged dataset, which contains some unlabeled samples.

But I wanted more. I was thinking about how to make the GAN part of our end-to-end learning process, because then the generated samples are not fixed but become more and more effective as the GAN’s weights change during training. Since we use the GAN to generate adversarial samples to attack the feature extraction network, the stronger (more robust) the feature extractor becomes, the more the GAN learns and the stronger it gets (it can produce more effective samples); in turn, a better feature extractor is obtained from the better impostor samples the GAN generates.

## The essence of APN: weight sharing

How can we make the GAN improve together with the feature extractor? It would be even better if the GAN always knew the current weakness of the feature extractor, so that the samples it generates always target the extractor. The answer is to let the feature extractor become part of the GAN.

In APN, we modified the GAN to use two distinct discriminators to guide the generator, and one of them, the one used to distinguish target people from non-target ones, shares its weights with the feature extractor. The benefit is obvious: the generator is now guided by the feature extractor’s current ability to discriminate targets from non-targets, so the generated samples are highly effective and specifically target the extractor’s growing discriminating ability. This achieves our goal of making the feature extractor more robust to similar-looking target and non-target people: in every batch, the feature extractor becomes more robust and the generator becomes more effective.
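The weight-sharing idea can be sketched in a few lines: the discriminator holds a reference to the feature extractor rather than its own copy of the weights, so every update to the extractor is immediately visible to the discriminator. This is only a toy illustration of the mechanism (the class names, shapes, and tanh extractor are placeholders, not the APN architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

class FeatureExtractor:
    """A toy stand-in for the re-id feature extraction network."""
    def __init__(self, W):
        self.W = W  # weight matrix, shared with the discriminator

    def extract(self, x):
        return np.tanh(self.W @ x)

class TargetDiscriminator:
    """The discriminator that separates targets from non-targets.
    Instead of keeping its own copy of the weights, it reuses the
    feature extractor's, so every extractor update is visible to it."""
    def __init__(self, extractor, v):
        self.extractor = extractor
        self.v = v  # its own small classification head

    def score(self, x):
        return float(self.v @ self.extractor.extract(x))

extractor = FeatureExtractor(rng.normal(size=(3, 5)))
disc = TargetDiscriminator(extractor, rng.normal(size=3))

extractor.W += 0.1  # a mock training update to the feature extractor
# The discriminator sees the exact same weight array, not a stale copy.
assert disc.extractor.W is extractor.W
```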

## Conclusion

In this post, I only covered the basic idea and my perspective on our work Adversarial Open-World Person Re-Identification, so it is coarse rather than detailed. For more information, it’s better to explore the paper yourself. :)

## Reference

1. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio. Generative Adversarial Networks. NIPS 2014.
2. Z. Zheng, L. Zheng and Y. Yang. Unlabeled Samples Generated by GAN Improve the Person Re-Identification Baseline in Vitro. CoRR 2017.

# My Understanding of CNNs (Convolutional Neural Networks)

The rapid rise of AI owes much to the introduction of CNNs (convolutional neural networks). I first read a popular-science article about neural networks in a magazine, before I was doing research in this area, and thought it was amazing that computers could simulate human neurons. In hindsight, we are still far from the level of biological neurons; a neural network is merely a high-dimensional nonlinear representation inspired by how biological neurons activate.

## Starting from the Components of a CNN

Although CNNs were proposed long ago, what really made them popular was their great success in image classification, which is inseparable from Hinton’s 2012 work [1]: in the ILSVRC-2012 image-classification competition it finished far ahead of the runner-up’s 26.2% error rate, a remarkable result. The network architecture presented in [1], AlexNet, has also become a classic.

### Convolutional Layers

#### Scale/Rotation Invariance

Spatial Transformer Networks

### Activation Layers

#### Nonlinearity

Without nonlinear activations, a stack of linear layers collapses into a single affine map:
$$\begin{aligned} f(\vec{x}) &= f_N(f_{N-1}(\cdots f_1(\vec{x}))) \\ &= W^N(W^{N-1}(\cdots W^1\vec{x}+\vec{b}^1)+\vec{b}^{N-1})+\vec{b}^N \\ &= W^N W^{N-1}\cdots W^1\vec{x} + W^N W^{N-1}\cdots W^2\vec{b}^1 + \cdots + W^N\vec{b}^{N-1}+\vec{b}^N \end{aligned}$$
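A quick numerical check of this collapse, with three random affine layers (the sizes are arbitrary toy choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three purely linear "layers": f_i(x) = W_i x + b_i, with no activation.
W1, W2, W3 = rng.normal(size=(4, 5)), rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
b1, b2, b3 = rng.normal(size=4), rng.normal(size=3), rng.normal(size=2)

x = rng.normal(size=5)
stacked = W3 @ (W2 @ (W1 @ x + b1) + b2) + b3

# The same map collapses into a single affine layer W x + b.
W = W3 @ W2 @ W1
b = W3 @ W2 @ b1 + W3 @ b2 + b3
collapsed = W @ x + b

assert np.allclose(stacked, collapsed)
```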

Plot of the derivative of the sigmoid function

Plot of the LeakyReLU function

### Pooling Layers

#### GAP

Parameter counts of each layer in VGG16

Global Average Pooling
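Global Average Pooling collapses each channel’s entire feature map to a single number by averaging over the spatial dimensions, replacing parameter-heavy fully connected layers with an operation that has no parameters at all. A minimal sketch:

```python
import numpy as np

# A (C, H, W) feature map; GAP averages over the spatial dimensions,
# producing one value per channel with zero extra parameters.
feat = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
gap = feat.mean(axis=(1, 2))

assert gap.shape == (2,)
assert np.isclose(gap[0], 4.0)   # mean of 0..8
assert np.isclose(gap[1], 13.0)  # mean of 9..17
```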

### BN Layers

The idea behind Batch Normalization is related to the method Hinton used when training Deep Belief Networks [14]. A Deep Belief Network is trained layer by layer, one layer at a time, which avoids the situation where gradients propagated back from later layers gradually vanish when many layers are trained at once. Batch Normalization instead normalizes each layer’s input, similarly decoupling the layers from one another: by assuming the data distribution between layers stays fixed, it alleviates the vanishing-gradient problem.
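A sketch of the Batch Normalization forward pass in training mode (per-feature statistics over the batch; in practice $\gamma$ and $\beta$ are learnable parameters, set to identity values here):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch Normalization forward pass for a (N, D) batch:
    normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=5.0, size=(64, 3))
out = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))

# Each feature now has (approximately) zero mean and unit variance.
assert np.allclose(out.mean(axis=0), 0.0, atol=1e-6)
assert np.allclose(out.var(axis=0), 1.0, atol=1e-3)
```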

## Network Architectures

### ResNet

#### Identity Mapping

The introduction of ResNet was exciting because it directly addressed the difficulty of training deep networks. Its form is elegant: $G(x) = F(x) + x$.

Residual Block

The hypothesis proposed in [6]

The modified ResNet in [15]
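A minimal sketch of the residual form $G(x) = F(x) + x$, with $F$ as two toy weight layers (biases and batch norm omitted; the shapes are placeholders). Note that when $F$ is zero, the block reduces to the identity on non-negative activations, which is exactly the easy solution that very deep plain networks struggle to learn:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """G(x) = F(x) + x, where F is two weight layers.
    The identity shortcut lets information (and gradients) pass
    through even if F contributes nothing."""
    return relu(W2 @ relu(W1 @ x) + x)

d = 4
x = np.array([1.0, -2.0, 3.0, 0.5])
# With zero weights, F(x) = 0 and the block acts as the identity
# (up to the final ReLU).
zeros = np.zeros((d, d))
assert np.allclose(residual_block(x, zeros, zeros), relu(x))
```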

### Inception

Google’s Inception is also a very popular architecture; it explores network width. But I have always felt it is not as clean and elegant as ResNet. The original Inception [16] proposed convolving the same layer’s input with $1 \times 1$, $3 \times 3$, and $5 \times 5$ kernels in parallel, plus a max pooling, and then stacking the results together. What is the benefit? As mentioned earlier, the kernel size essentially determines a layer’s receptive field, i.e., how large a region of the previous layer it can perceive. The Inception module, when it is unclear how large a region should be perceived, simply perceives regions of every size and uses all of these features together.
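The parallel-branch idea can be sketched for a single-channel input (a toy illustration, not the full module, which also uses $1 \times 1$ bottlenecks and many filters per branch):

```python
import numpy as np

def conv2d_same(img, k):
    """Naive stride-1 'same'-padding 2D convolution, single channel."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out

def maxpool3_same(img):
    """3x3 max pooling, stride 1, 'same' padding."""
    padded = np.pad(img, 1, constant_values=-np.inf)
    return np.array([[padded[i:i + 3, j:j + 3].max()
                      for j in range(img.shape[1])]
                     for i in range(img.shape[0])])

def inception_module(img, k1, k3, k5):
    """Run 1x1, 3x3, 5x5 convolutions and 3x3 max pooling in parallel
    on the same input, then stack the results along a channel axis."""
    branches = [conv2d_same(img, k1), conv2d_same(img, k3),
                conv2d_same(img, k5), maxpool3_same(img)]
    return np.stack(branches)

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
out = inception_module(img, rng.normal(size=(1, 1)),
                       rng.normal(size=(3, 3)), rng.normal(size=(5, 5)))
# Spatial size is preserved; the four branches become four channels.
assert out.shape == (4, 8, 8)
```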

## Reference

1. A. Krizhevsky, I. Sutskever and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012.
2. X. Han, Y. Zhong, L. Cao and L. Zhang. Pre-Trained AlexNet Architecture with Pyramid Pooling and Supervision for High Spatial Resolution Remote Sensing Image Scene Classification. MDPI 2017.
3. S. Ioffe and C. Szegedy. Batch Normalization. ICML 2015.
4. I. Goodfellow, Y. Bengio and A. Courville. Deep Learning.
5. M. Jaderberg, K. Simonyan, A. Zisserman and K. Kavukcuoglu. Spatial Transformer Networks. NIPS 2015.
6. K. He, X. Zhang, S. Ren and J. Sun. Deep Residual Learning for Image Recognition. CVPR 2016.
7. G. Huang, Z. Liu and L. van der Maaten. Densely Connected Convolutional Networks. CVPR 2017.
8. A. G. Howard, M. Zhu, B. Chen and D. Kalenichenko. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv preprint arXiv:1704.04861.
9. An answer by 留德华叫兽 on Zhihu. https://www.zhihu.com/question/38098038
10. I. J. Goodfellow, J. Shlens and C. Szegedy. Explaining and Harnessing Adversarial Examples. arXiv preprint arXiv:1412.6572.
11. A. L. Maas, A. Y. Hannun and A. Y. Ng. Rectifier Nonlinearities Improve Neural Network Acoustic Models. ICML 2013.
12. K. He, X. Zhang, S. Ren and J. Sun. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. ICCV 2015.
13. M. Lin, Q. Chen and S. Yan. Network in Network. arXiv preprint arXiv:1312.4400.
14. G. E. Hinton. Deep Belief Networks. Scholarpedia 2009.
15. K. He, X. Zhang, S. Ren and J. Sun. Identity Mappings in Deep Residual Networks. ECCV 2016.
16. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich. Going Deeper with Convolutions. CVPR 2015.

# Building Your Own Small Blog with Nginx + uwsgi + Flask


Nginx setup
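A minimal sketch of the Nginx side of this stack, forwarding requests to a uwsgi server over a local Unix socket (the domain name and socket path below are placeholder assumptions):

```nginx
server {
    listen 80;
    server_name example.com;  # placeholder domain

    location / {
        include uwsgi_params;
        # Forward requests to uwsgi over a local Unix socket;
        # the socket path must match the one in the uwsgi config.
        uwsgi_pass unix:/tmp/uwsgi.sock;
    }
}
```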



uwsgi configuration
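A matching minimal uwsgi `.ini` sketch, assuming the Flask application object is named `app` inside `app.py` (both names, the process count, and the socket path are assumptions):

```ini
[uwsgi]
; Assumed layout: a Flask app object named "app" in app.py.
module = app:app
master = true
processes = 2
; Must match the uwsgi_pass socket in the Nginx config.
socket = /tmp/uwsgi.sock
chmod-socket = 664
; Remove the socket file on exit.
vacuum = true
```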

# A Web Crawler for Elective Courses in the Sun Yat-sen University Undergraduate Educational Administration System
