2015 conclusion

2015 in Review: Life

2015 slipped away before I knew it. Looking back on the year, everything except a few major events is a blur. Fortunately I have the habit of taking photos and posting them to WeChat Moments; flipping through them, the memories come flooding back.

The future life I can imagine for myself looks like this: spending time with family, plus travel! And I have been steadily moving down that road!

Family

Last year, the biggest event for my family was that my elder brother finally got married!!!! For a 36-year-old man, that was no small feat. My sister-in-law is expecting, with the baby on the way, so my brother is becoming a father rather late in life. Congratulations all around!


The most romantic thing I can imagine is growing old together with you.


Travel

I have always thought: it took untold hardships to arrive in this world, and if I don't see it all before I return to that other dimension, how will I tell this world's stories? Luckily I am not a hopeless "bed addict"; I love wandering around. Last year I visited quite a few places!

March: hiking along the Fuchun River; rapeseed flowers at Xuling


April: the Huiqing Ancient Trail; caving, rappelling, and rock climbing at Jilong


May: Yushan Island


June-August: Hong Kong


October: the Wuyue Ancient Trail; a trip to Thailand


Solitude

The hardest thing is to never feel bored when alone, to refuse to settle, and to take good care of yourself. Treat yourself with the warmth you would show a guest!!

As long as there is light, you are never truly alone; your best companion, your shadow, follows you everywhere!!


Cooking

Being a chef is another career I dream about. Maybe after a few more years as a programmer I'll switch to cooking :)


Exercise


Reading


My future self

Career achiever? Or devoted wife and mother?


Independent but Not Lonely: On Solitude

Solitude

Lately I have been doing everything alone: eating alone, hiking alone, going to the movies alone... Yet I haven't felt the slightest boredom or loneliness. I believe that, without noticing it, I have learned to keep my own company.

After growing up, what I feared most was being alone, and this is even more true for girls. We find companions for meals, for shopping, even for trips to the restroom. If no companion can be found, we would rather not go at all than go alone. I used to be exactly like that. A senior classmate once told me: you depend on other people too much.

I often see kittens and puppies chasing their own tails, entertaining themselves for ages. When I was little, a lump of mud could keep me happily occupied for a whole afternoon, and boredom was rare. Now, the moment I am alone, boredom creeps in. Especially on holidays at home: if I don't go out to see friends, it is hard to kill time each day. Boredom breeds restlessness; I want to do something, but have no heart for it. This must be a state of mind peculiar to adults.

Being alone has its advantages, too. You can think more clearly. What you face is your naked inner world, without any disguise: sometimes innocent, sometimes petty, sometimes noble. You can talk with yourself, with the little person inside whom no one else can see. He tells you what kind of person you are, what you need to make up for, and where you need to grow. You can step outside your body and examine yourself from a height. Knowing yourself is the hardest thing of all, and time alone is the perfect opportunity to meet the self you never show to others.

Reading is another of solitude's pleasures. Words are far more powerful than we imagine. After finishing a story and savoring it slowly, I always find something of other people's lives reflected in my own.

Exercise brings its own joy: feeling the body change, the muscles grow, the sweat pour.

During this period my inner world has slowly grown stronger, and I trust myself more. If I can do alone, without bitterness, the many things I once thought required someone's company, then what in the future could I not accomplish?

The person I hope to be is one whose heart stays calm and at peace, who never feels bored no matter the circumstances. Right now I am mending that heart, bringing it ever closer to the shape I want. I believe I will get there!!!

Gang Leader for a Day

A Year of Books, Part 1: Gang Leader for a Day

A couple of days ago I finished Gang Leader for a Day. From the title it sounds like a novel, but it is actually a work of sociology. Below I introduce the book and my reflections on it.

About the author:

The author, Sudhir Venkatesh, is a professor of sociology at Columbia University. While pursuing his PhD in sociology at the University of Chicago, he studied Chicago's poorest Black communities, spending ten years embedded with a drug-dealing gang. The dissertation he wrote from what he saw and heard there made his name.

Main content:

The book is set in the 1990s. While working on his PhD at the University of Chicago, the author went deep into Chicago's poor Black communities and befriended the leader of a drug-dealing gang, the "Black Kings". With the leader's help, he spent ten years inside the community, faithfully recording what he saw there. The book has eight chapters:

  • Chapter 1: How does it feel to be Black and poor? Abandoning the conventional sociologist's approach of survey statistics, the author enters the poor community of Oakland for the first time and meets J.T., a senior figure in the "Black Kings".
  • Chapter 2: First days on Federal Street. Describes the life of J.T.'s family in the Robert Taylor Homes, another poor neighborhood.
  • Chapter 3: Someone to watch over me. Describes the political life of the Robert Taylor Homes.
  • Chapter 4: Gang leader for a day. The author follows J.T. to experience a day as gang leader; the whole drug-dealing ecosystem surfaces.
  • Chapter 5: Ms. Bailey's neighborhood. Describes how Ms. Bailey, the manager of the Robert Taylor Homes, helps the community on one hand while exploiting ordinary tenants and colluding with the police on the other.
  • Chapter 6: The hustler and the hustled.
  • Chapter 7: Black and blue. Describes how the neighborhood police and the gang collude for mutual profit.
  • Chapter 8: The stay-together gang. Describes how the tenants of the Robert Taylor neighborhood band together in the face of demolition.

Reflections:

1. Respect for the author's scholarly spirit

To obtain first-hand material on a poor community, the author went inside a drug gang and stayed for ten years. He witnessed several shootings and deaths, yet never backed away; it was as if his nerve for fear had been severed so that he could not sense danger. I admire him enormously. To truly understand something, you must go inside it and live alongside it; only then can your understanding be complete and deep. That is the biggest lesson the author taught me by example. Without investigation, you have no right to speak.

2. Black people are not as frightening as imagined

Whenever I saw a Black person, the first thing I would associate with him was danger. I remember meeting a Black colleague at the office once; even knowing he was highly educated, I still felt uneasy. As a friend put it, this is a normal reaction to statistical impressions, but the question worth asking is why the crime rate among Black Americans is so high. People are born good! Take J.T.'s mother, described in chapter 2: a dignified and kind woman. Or J.T. himself: a drug dealer, yet a son devoted to his mother and a responsible husband and father. Discordant voices exist in the community, but so does warmth. Many who do illegal things nonetheless long, deep down, for a calm, normal life.

3. Those at the bottom always suffer most

In the day the author spent shadowing J.T. as gang leader, he describes the lives of the foot soldiers who sell drugs at the very bottom. Shivering in the biting wind, they still have to sell, earning the thinnest of wages while the bulk of the drug money flows upward. Then there is Ms. Bailey, the building manager, who gets things done only after taking her cut. These poor Black residents survive by struggling. And, as the book repeatedly notes, ambulances and police are reluctant to enter those poor neighborhoods at all.

4. The road to racial equality is long

Some sociologists believe in a "culture of poverty": poor Black people do not work because, unlike other groups, they do not value work, and this attitude is passed down through the generations. But is that right? Are Black and white people really given equal opportunities to work? Real equality requires that someone at the top represent the will of poor Black people, someone who, when policy is made, extends more care to them instead of leaving them to fend for themselves.

It makes me grateful to live in such a good country and era. Yet in every country and region, dark corners exist at the edges of cities, where people can barely secure three meals a day. When even that most basic desire goes unmet, it is easy to imagine how far they might go, breaking rules and laws, for a full meal. They crave recognition and attention; if those who govern paid them more attention, I believe things could improve at least somewhat.

Classify images using python interface in caffe

In a previous post I fine-tuned an existing model and trained it. The next question is how to classify images with it. In this post, I will explain how to use Caffe's Python interface to classify images.

Here is the Python source code:

import numpy as np
import sys
import os

caffe_root = "/home/your_name/Downloads/caffe-master/"
sys.path.insert(0, caffe_root + "python")
import caffe

caffe.set_mode_cpu()
MODEL_FILE = caffe_root + "models/colon_caffenet/colon_deploy.prototxt"
# keep the deploy file consistent with train_val.prototxt
PRETRAINED = caffe_root + "models/colon_caffenet/caffenet_train_iter_80.caffemodel"
MEAN = caffe_root + "examples/colon/colon_mean.npy"
# later I will show how to transform *.binaryproto to *.npy
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       mean=np.load(MEAN).mean(1).mean(1),
                       channel_swap=(2, 1, 0),
                       raw_scale=255,
                       image_dims=(256, 256))

filewriter = open(caffe_root + "data/colon/test_result.txt", "w+")
for root, dirs, files in os.walk(caffe_root + "data/colon/test/"):  # all the images are in the test folder
    for file in files:
        IMAGE_FILE = os.path.join(root, file).decode('gbk').encode('utf8')
        input_image = caffe.io.load_image(IMAGE_FILE)
        prediction = net.predict([input_image])
        string = os.path.basename(IMAGE_FILE) + " " + str(prediction[0].argmax()) + "\n"
        filewriter.write(string)
        print os.path.basename(IMAGE_FILE), prediction[0].argmax()

filewriter.close()
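The loop above writes one "image_name predicted_label" pair per line to test_result.txt. If you also keep a ground-truth file in the same format (labels.txt below is a hypothetical name, not from the post), the predictions can be scored with a short sketch like this:

```python
def read_labels(path):
    """Parse '<image name> <label>' lines into a {name: label} dict."""
    labels = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                labels[parts[0]] = parts[1]
    return labels


def accuracy(pred_path, truth_path):
    """Fraction of images whose predicted label matches the ground truth."""
    pred = read_labels(pred_path)
    truth = read_labels(truth_path)
    common = [name for name in truth if name in pred]
    if not common:
        return 0.0
    correct = sum(1 for name in common if pred[name] == truth[name])
    return float(correct) / len(common)
```

Then something like accuracy("test_result.txt", "labels.txt") gives the overall test accuracy.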

convert binaryproto to npy

import caffe
import numpy as np
import sys

if len(sys.argv) != 3:
    print "Usage: python convert_protomean.py proto.mean out.npy"
    sys.exit()

blob = caffe.proto.caffe_pb2.BlobProto()
data = open(sys.argv[1], 'rb').read()
blob.ParseFromString(data)
arr = np.array(caffe.io.blobproto_to_array(blob))
out = arr[0]
np.save(sys.argv[2], out)
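A side note on the `np.load(MEAN).mean(1).mean(1)` expression passed to `caffe.Classifier` earlier: it collapses the saved (channels, height, width) mean image into a single mean value per channel. A minimal sketch with a synthetic array standing in for colon_mean.npy:

```python
import numpy as np

# Synthetic (channels, height, width) mean image in place of colon_mean.npy;
# the per-channel values here are made up for illustration.
mean_image = np.zeros((3, 256, 256))
mean_image[0] = 104.0
mean_image[1] = 117.0
mean_image[2] = 123.0

# The first mean(1) averages over height, the second over width,
# leaving one scalar per channel -- shape (3,).
per_channel = mean_image.mean(1).mean(1)
```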

Many problems when importing caffe

When Python executes "import caffe", many problems come up, all related to Python packages. Although I have pointed the Python library path at the anaconda2 lib, the program still seems to look for all its dependencies in /usr/lib/python2.7/dist-packages. So if you encounter the same issue, copy all the dependencies from /anaconda2/lib/python2.7/site-packages to /usr/lib/python2.7/dist-packages:

sudo cp -r /anaconda2/lib/python2.7/site-packages/* /usr/lib/python2.7/dist-packages
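Before copying everything over, it may be worth checking which directories the interpreter actually searches, so you can see whether anaconda2's site-packages is on the import path at all. A small diagnostic sketch:

```python
import sys

# Print every directory Python searches for imports, in search order.
# If anaconda2's site-packages is missing here, that explains why the
# copies in /usr/lib/python2.7/dist-packages win out.
for directory in sys.path:
    print(directory)
```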

Import cv2: No Module named cv2

Copy /usr/lib/python2.7/dist-packages/cv2.so to /anaconda2/lib/python2.7/site-packages/

ValueError: Mean shape incompatible with input shape

The problem is in io.py; here is the solution.

Go to the lines around line 253 in caffe-master/python/caffe/io.py and replace

if ms != self.inputs[in_][1:]:
    raise ValueError('Mean shape incompatible with input shape.')

by

if ms != self.inputs[in_][1:]:
    print(self.inputs[in_])
    in_shape = self.inputs[in_][1:]
    m_min, m_max = mean.min(), mean.max()
    normal_mean = (mean - m_min) / (m_max - m_min)
    mean = resize_image(normal_mean.transpose((1, 2, 0)), in_shape[1:]).transpose((2, 0, 1)) * (m_max - m_min) + m_min
    # raise ValueError('Mean shape incompatible with input shape.')

Rebuild caffe and the error should be gone.

fine tune ImageNet Model for image classification

Since I have installed Caffe, I will now adapt it to my own application. My goal is binary classification of images. I will take the trained bvlc_reference_caffenet model and fine-tune it for my task.

The whole process is as follows:


[TOC]

Data Preparation

Prepare original data

You need to prepare four files:

  • train folder which contains the training images
  • val folder which contains the testing images
  • train.txt file which contains the labels of training images.
  • val.txt file which contains the labels of testing images.

    Note that the order of the image names in each *.txt file must match the order in the train and val folders.

The train.txt lists one image path per line, followed by its numeric label.

Here the train folder includes two subfolders, cat and dog; each subfolder contains the cat or dog images respectively.

What you need to do is go to the caffe_master/data folder, create a new folder named myself, and put the four items above into it.

So now in the myself folder you can see four items: train, val, train.txt, val.txt.
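The two *.txt files can be generated from the folder layout itself. The following is only a sketch, assuming the cat/dog layout described above (the function name and example paths are mine, not from the post):

```python
import os

def write_listing(image_root, out_txt, class_names):
    """Write '<subfolder>/<file> <label>' lines, one per image,
    labeling each image by the subfolder it sits in."""
    with open(out_txt, "w") as out:
        for label, cls in enumerate(class_names):
            folder = os.path.join(image_root, cls)
            for name in sorted(os.listdir(folder)):
                out.write("%s/%s %d\n" % (cls, name, label))

# Example (paths assumed):
# write_listing("data/myself/train", "data/myself/train.txt", ["cat", "dog"])
```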

Transform data format

Next we will use Caffe's tools to transform the image files into the data format the ImageNet model consumes.

  1. Copy all *.sh files in caffe-master/examples/imagenet to myself folder.
  2. Change the file path in create_imagenet.sh
  3. Run it and then it will generate myself_train_lmdb and myself_val_lmdb in myself

Compute the mean value

  1. Change the file path in make_imagenet_mean.sh
  2. Run it and it will generate myself_mean.binaryproto in myself

Okay, by now you have prepared all the data the ImageNet model needs.

Under myself folder, you can see:

  • myself_train_lmdb folder: it contains the training data
  • myself_val_lmdb folder: it contains the testing data
  • myself_mean.binaryproto: it is the mean value

Fine Tune the trained model

Firstly, we need to download the trained model.

# the root path is "caffe-master"
./scripts/download_model_binary.py models/bvlc_reference_caffenet # it will take some time to download it

After that you need to fine tune it.

  • change the train_val.prototxt
    (1) change the input data path relatively

    (2) We rename the last layer from "fc8" to "fc8_myself". Because the pretrained model has no layer named "fc8_myself", that layer's weights will be randomly reinitialized and trained from scratch.

    (3) change the params of the last layer

param {
  lr_mult: 10
  decay_mult: 1
}
param {
  lr_mult: 20
  decay_mult: 0
}
inner_product_param {
  num_output: 2
  ...
  ...
}
  • set up the solver.prototxt
net: "models/myself_caffenet/train_val.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 10000
display: 20
max_iter: 50000 # numofTotalImg / batch_size * 10
momentum: 0.9
weight_decay: 0.0005
snapshot: 2000
snapshot_prefix: "models/myself_caffenet/caffenet_train"
solver_mode: GPU
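Putting steps (2) and (3) together, the renamed last layer in train_val.prototxt might look roughly like this. This is only a sketch modeled on the standard caffenet fc8 layer; the bottom/top names and omitted fields are assumptions:

```
layer {
  name: "fc8_myself"        # renamed so the weights are reinitialized
  type: "InnerProduct"
  bottom: "fc7"             # assumed: the usual caffenet topology
  top: "fc8_myself"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2           # two classes for binary classification
  }
}
```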

Train the model

./build/tools/caffe train \
-solver models/myself_caffenet/solver.prototxt \
-weights models/myself_caffenet/bvlc_reference_caffenet.caffemodel

The training progress will be shown in the terminal.

Test the model

./build/tools/caffe test -model=models/myself_caffenet/train_val.prototxt -weights=models/myself_caffenet/caffenet_iter_1000.caffemodel

The test results will be printed to the terminal.

Ubuntu 14.04 configure Caffe

These days I have been busy configuring Caffe on Ubuntu 14.04 for a small project. Here I want to write down the steps showing how to install it.

If you follow the installation instructions on the official site, I think you will go crazy: the documents are not well organized. I succeeded in installing it by following a Chinese blog. Apart from simple translation, I will also add some content of my own.

The whole process is as follows:

Environments:

  • Ubuntu 14.04 64bit
  • 8G Memory
  • GeForce GT 705 Graphics Card
  • CUDA 7.5
  • caffe from github

Steps:

  • Install all the dependencies
  • Install CUDA 7.5
  • Install Atlas
  • Install OpenCV
  • Install Anaconda2
  • Install Caffe
  • Compile Python wrapper
  • Test Caffe

Install all the dependencies

sudo apt-get install build-essential  # basic requirement
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler # required by caffe

Install CUDA 7.5

You can follow the official website instructions to install it. Before you install the packages, you'd better check whether each package is intact via its md5 checksum:

md5sum package_name

After you finish the installation, please reboot the computer.

sudo reboot

Later, we need to configure the environmental variable for cuda.

sudo gedit /etc/profile
# add this line to /etc/profile
export PATH=/usr/local/cuda/bin:$PATH

# make the profile take effect
source /etc/profile

At the same time, we need to add the library path to /etc/ld.so.conf.d folder.

cd /etc/ld.so.conf.d
sudo vim cuda.conf
# add this line to the cuda.conf file
/usr/local/cuda/lib64

# make the change take effect
sudo ldconfig

Now we build the CUDA samples:

cd /usr/local/cuda/samples
sudo make all -j4 # 4 is the number of your cpu cores
# It will take about 10 mins to complete

cd bin/x86_64/linux/release
./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 670"
CUDA Driver Version / Runtime Version: 6.5 / 6.5
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 4095 MBytes (4294246400 bytes)
( 7) Multiprocessors, (192) CUDA Cores/MP: 1344 CUDA Cores
GPU Clock rate: 1098 MHz (1.10 GHz)
Memory Clock rate: 3105 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z): 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers: 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers: 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GeForce GTX 670
Result = PASS

If your command window shows a similar result, your graphics card driver is installed successfully; otherwise something went wrong during the installation.

Install Atlas

sudo apt-get install libatlas-base-dev

Install OpenCV-2.4.10

  1. Download the installation script
  2. Go to the Install-OpenCV/Ubuntu/2.4 folder
    sudo sh ./opencv2_4_10.sh

Install Anaconda2

  • Download Anaconda2 from the official website
  • Unzip and go to the folder

    sudo sh ./Anaconda2-2.4.1-Linux-x86_64.sh
  • Add the Anaconda Library Path

    sudo vim /etc/ld.so.conf
    # add this line at the end of the file
    /home/your_name/anaconda2/lib
    # make the change take effect
    sudo ldconfig

    sudo vim ~/.bashrc
    export LD_LIBRARY_PATH="/home/your_name/anaconda2/lib:$LD_LIBRARY_PATH"
    export PATH="/home/your_name/anaconda2/bin:$PATH"

    source ~/.bashrc

    # test the python version
    python --version

Install Caffe

  • Download Caffe from github

    Then we need to download the Python dependencies:

    cd caffe-master/python
    for req in $(cat requirements.txt);
    do sudo pip install $req; done

    Make sure all the dependencies are installed properly; otherwise we will run into many issues when compiling caffe.

  • Compile Caffe

cd caffe-master
cp Makefile.config.example Makefile.config
sudo vim Makefile.config

We should make some changes to the configuration.

# Uncomment these lines
ANACONDA_HOME := $(HOME)/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
$(ANACONDA_HOME)/include/python2.7 \
$(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include

# PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib
# make sure your current folder is caffe-master
sudo make all -j4
sudo make test
sudo make runtest

Compile Python Wrapper

# make sure your current folder is caffe-master
sudo make pycaffe

OK, well done! Congratulations! You have finished all the steps to install caffe. Now let's run a small test to get a feel for caffe's performance.

Test

We will use the mnist example for the test. I also followed a Chinese blog.

sudo sh data/mnist/get_mnist.sh     # get training data and testing data from the internet

You will see four files in the folder of data/mnist

  • train-images-idx3-ubyte (training samples)
  • train-labels-idx1-ubyte (training samples’ labels)
  • t10k-images-idx3-ubyte (testing samples)
  • t10k-labels-idx1-ubyte (testing samples’ labels)
sudo sh examples/mnist/create_mnist.sh  # convert the format of the original datasets

sudo time sh examples/mnist/train_lenet.sh # run it (CPU: 13 min; GPU: 4 min; GPU + cuDNN: 40 s)

Check info of VGA card in Linux

Just to note down some commands for future reference.

These commands are used to check the hardware information of VGA cards in Linux.

lspci -vnn | grep VGA -A 12

lshw -C display

sudo lshw -c video | grep configuration

Mac configure Maven3

Make sure JAVA_HOME has been set up

Before we configure Maven, we need to make sure JAVA_HOME is set up properly.

echo $JAVA_HOME

If nothing comes out, it means you didn't set up JAVA_HOME before. You can follow the instructions below; if you have already configured it correctly, just skip this step.

sudo vi /etc/profile   # or ~/.bash_profile or /etc/bashrc
# add this line to the file
export JAVA_HOME=$(/usr/libexec/java_home)
source /etc/profile

echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home

Set up Maven

We can download the latest version of Maven, unzip it into the install folder (/usr/local/opt/maven3), and then register Maven in the environment variable files.

sudo vi /etc/profile
# add these lines to the file
export M2_HOME=/usr/local/opt/maven3/apache-maven-3.3.9
export PATH=$M2_HOME/bin:$PATH
source /etc/profile

mvn -v
Java version: 1.8.0_66, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/jre
Default locale: zh_CN, platform encoding: UTF-8
OS name: "mac os x", version: "10.10", arch: "x86_64", family: "mac"

Maven3 downloads the jars listed in pom.xml

If we want to download the jars declared in pom.xml from the command line, we can write a .bat file in the same folder as pom.xml:

call mvn -f pom.xml dependency:copy-dependencies

@pause

Then double-click the .bat file. It will download the jars automatically into your local repo (~/.m2).

Spark1.5.1 Cluster Environment Setup

Basically, I followed this blog to deploy the cluster environment. Instead of Spark 1.3.0, we adopted the latest version, 1.5.1, but that doesn't affect the configuration.

An issue you may come across

I ran into a problem while setting up password-free SSH login. Two of the machines could access each other; the third was stuck, with the other two unable to access it. After some investigation, we found it was a permissions problem.

The permissions of the .ssh folder must be restricted to 755.

The permissions of authorized_keys must be restricted to 600.
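The two constraints can be applied with chmod. A minimal sketch, assuming the default ~/.ssh location:

```shell
# Tighten permissions so sshd will accept key-based (password-free) logins:
# the .ssh directory at 755, the authorized_keys file at 600.
chmod 755 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```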