Seven Story Rabbit Hole

Sometimes awesome things happen in deep rabbit holes. Or not.


Running CBFS

This will walk you through getting a cbfs cluster up and running.

What is CBFS?

cbfs is a distributed filesystem on top of Couchbase Server, not unlike MongoDB’s GridFS or Riak CS. It’s currently an experimental “skunkworks” project and hasn’t quite reached official project status.

Here’s a typical deployment architecture:

cbfs overview

Although not shown, all cbfs daemons can communicate with all Couchbase Server instances.

It is not required to run cbfs on the same machine as Couchbase Server, but it is meant to be run in the same data center as Couchbase Server.

If you want a deeper understanding of how cbfs works, check the cbfs presentation or this blog post.

Kick off a Couchbase Cluster

cbfs depends on having a Couchbase cluster running.

Follow all of the steps in Running Couchbase Cluster Under CoreOS on AWS to kick off a 3 node Couchbase cluster.

Add security groups

A few ports will need to be opened up for cbfs.

Go to the AWS console, edit the Couchbase-CoreOS-CoreOSSecurityGroup-xxxx security group, and add the following rules:

Type             Protocol  Port Range Source  
----             --------  ---------- ------
Custom TCP Rule  TCP       8484       Custom IP: sg-6e5a0d04 (copy and paste from port 4001 rule)
Custom TCP Rule  TCP       8423       Custom IP: sg-6e5a0d04 

At this point your security group should look like this:

security group

Create a new bucket for cbfs

Open Couchbase Server Admin UI

In the AWS EC2 console, find the public IP of one of the instances (it doesn’t matter which)

In your browser, go to http://<public_ip>:8091/

Create Bucket

Go to Data Buckets / Create New Bucket

Enter cbfs for the name of the bucket.

Leave all other settings as default.

create bucket

ssh in

In the AWS EC2 console, find the public IP of one of the instances (it doesn’t matter which)

ssh into one of the machines:

$ ssh -A core@<public_ip>

Run cbfs

Create a volume dir

Since the filesystem of a docker container is not meant for high-throughput I/O, a volume should be used for cbfs.

Create a directory on the host OS (i.e., on the CoreOS instance):

$ sudo mkdir -p /var/lib/cbfs/data
$ sudo chown -R core:core /var/lib/cbfs

This will be mounted by the docker container in the next step.

Generate fleet unit files

$ wget https://gist.githubusercontent.com/tleyden/d70161c3827cb8b788a8/raw/8f6c81f0095b0007565e9b205e90afb132552060/cbfs_node.service.template
$ for i in `seq 1 3`; do cp cbfs_node.service.template cbfs_node.$i.service; done

Start cbfs on all cluster nodes

$ fleetctl start cbfs_node.*.service

Run fleetctl list-units to list the units running in your cluster. You should have the following:

$ fleetctl list-units
UNIT                                            MACHINE                         ACTIVE    SUB
cbfs_node.1.service                             6ecff20c.../10.51.177.81        active    running
cbfs_node.2.service                             b8eb6653.../10.79.155.153       active    running
cbfs_node.3.service                             02d48afd.../10.186.172.24       active    running
couchbase_bootstrap_node.service                02d48afd.../10.186.172.24       active    running
couchbase_bootstrap_node_announce.service       02d48afd.../10.186.172.24       active    running
couchbase_node.1.service                        6ecff20c.../10.51.177.81        active    running
couchbase_node.2.service                        b8eb6653.../10.79.155.153       active    running

View cbfs output

$ fleetctl journal cbfs_node.1.service
2014/11/14 23:18:58 Connecting to couchbase bucket cbfs at http://10.51.177.81:8091/
2014/11/14 23:18:58 Error checking view version: MCResponse status=KEY_ENOENT, opcode=GET, opaque=0, msg: Not found
2014/11/14 23:18:58 Installing new version of views (old version=0)
2014/11/14 23:18:58 Listening to web requests on :8484 as server 10.51.177.81
2014/11/14 23:18:58 Error removing 10.51.177.81's task list: MCResponse status=KEY_ENOENT, opcode=DELETE, opaque=0, msg: Not found
2014/11/14 23:19:05 Error updating space used: Expected 1 result, got []

Run cbfs client

Run a bash shell in a docker container that has cbfsclient pre-installed:

$ sudo docker run -ti --net=host tleyden5iwx/cbfs /bin/bash

Upload a file

From within the docker container launched in the previous step:

# echo "foo" > foo
# ip=$(hostname -i | tr -d ' ')
# cbfsclient http://$ip:8484/ upload foo /foo

There should be no errors. If you run fleetctl journal cbfs_node.1.service again on the CoreOS instance, you should see log messages like:

2014/11/14 21:51:43 Recorded myself as an owner of e242ed3bffccdf271b7fbaf34ed72d089537b42f: result=success

List directory

# cbfsclient http://$ip:8484/ ls /
foo

It should list the foo file we uploaded earlier.

Congratulations! You now have cbfs up and running.


An Example of Using NSQ From Go

NSQ is a message queue, similar to RabbitMQ. I decided I’d give it a whirl.

Install NSQ

$ wget https://s3.amazonaws.com/bitly-downloads/nsq/nsq-0.2.31.darwin-amd64.go1.3.1.tar.gz
$ tar xvfz nsq-0.2.31.darwin-amd64.go1.3.1.tar.gz
$ sudo mv nsq-0.2.31.darwin-amd64.go1.3.1/bin/* /usr/local/bin

Launch NSQ

$ nsqlookupd & 
$ nsqd --lookupd-tcp-address=127.0.0.1:4160 &
$ nsqadmin --lookupd-http-address=127.0.0.1:4161 &

Get Go client library

$ go get -u -v github.com/bitly/go-nsq

Create a producer

Add the following code to main.go:

package main

import (
  "log"
  "github.com/bitly/go-nsq"
)

func main() {
  config := nsq.NewConfig()
  w, _ := nsq.NewProducer("127.0.0.1:4150", config)

  err := w.Publish("write_test", []byte("test"))
  if err != nil {
      log.Panic("Could not connect")
  }

  w.Stop()
}

and then run it with:

$ go run main.go

If you go to your NSQAdmin at http://localhost:4171, you should see a single message in the write_test topic.

NSQAdmin

Create a consumer

package main

import (
  "log"
  "sync"

  "github.com/bitly/go-nsq"
)

func main() {

  wg := &sync.WaitGroup{}
  wg.Add(1)

  config := nsq.NewConfig()
  q, _ := nsq.NewConsumer("write_test", "ch", config)
  q.AddHandler(nsq.HandlerFunc(func(message *nsq.Message) error {
      log.Printf("Got a message: %v", message)
      wg.Done()
      return nil
  }))
  err := q.ConnectToNSQD("127.0.0.1:4150")
  if err != nil {
      log.Panic("Could not connect")
  }
  wg.Wait()

}

and then run it with:

$ go run main.go

You should see output:

2014/11/12 08:37:29 INF    1 [write_test/ch] (127.0.0.1:4150) connecting to nsqd
2014/11/12 08:37:29 Got a message: &{[48 55 54 52 48 57 51 56 50 100 50 56 101 48 48 55] [116 101 115 116] 1415810020571836511 2 0xc208042118 0 0}

Congratulations! You just pushed a message through NSQ.

Enhanced consumer: use NSQLookupd

The above example hardcoded the ip of nsqd into the consumer code, which is not a best practice. A better way to go about it is to point the consumer at nsqlookupd, which will transparently connect to the appropriate nsqd that happens to be publishing that topic.

In our example, we only have a single nsqd, so it’s an extraneous lookup. But it’s good to get into the right habits early, especially if you are a habitual copy/paster.

The consumer example only needs a one-line change to get this enhancement:

err := q.ConnectToNSQLookupd("127.0.0.1:4161")

This connects to the HTTP port of nsqlookupd rather than directly to nsqd’s TCP port.

CoreOS With Nvidia CUDA GPU Drivers

This will walk you through installing the Nvidia GPU kernel module and CUDA drivers on a docker container running inside of CoreOS.

architecture diagram

Launch CoreOS on an AWS GPU instance

  • Launch a new EC2 instance

  • Under “Community AMIs”, search for ami-7c8b3f14 (CoreOS-stable-410.1.0-hvm)

  • Select the GPU instances: g2.2xlarge

  • Increase root EBS store from 8 GB –> 20 GB to give yourself some breathing room

ssh into CoreOS instance

Find the public ip of the EC2 instance launched above, and ssh into it:

$ ssh -A core@ec2-54-80-24-46.compute-1.amazonaws.com

Run Ubuntu 12 docker container in privileged mode

$ sudo docker run --privileged=true -i -t ubuntu:12.04 /bin/bash

After the above command, you should be inside a root shell in your docker container. The rest of the steps will assume this.

Install build tools + other required packages

$ apt-get update
$ apt-get install build-essential wget git

Prepare CoreOS kernel source

Clone CoreOS kernel repository

$ mkdir -p /usr/src/kernels
$ cd /usr/src/kernels
$ git clone https://github.com/coreos/linux.git

Get CoreOS kernel version

$ uname -a
Linux ip-10-183-54-167.ec2.internal 3.15.8+ #2 SMP Fri Sep 26 08:37:17 UTC 2014 x86_64 Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz GenuineIntel GNU/Linux

The CoreOS kernel version is 3.15.8

Switch correct branch for this kernel version

$ cd linux
$ git checkout remotes/origin/coreos/v3.15.8

Create kernel configuration file

$ zcat /proc/config.gz > /usr/src/kernels/linux/.config

Prepare kernel source for building modules

$ cd /usr/src/kernels/linux
$ make modules_prepare

Now you should be ready to install the nvidia driver.

Install nvidia driver

Download

$ mkdir -p /opt/nvidia
$ cd /opt/nvidia
$ wget http://developer.download.nvidia.com/compute/cuda/6_5/rel/installers/cuda_6.5.14_linux_64.run

Unpack

$ chmod +x cuda_6.5.14_linux_64.run
$ mkdir nvidia_installers
$ ./cuda_6.5.14_linux_64.run -extract=`pwd`/nvidia_installers

Install

$ cd nvidia_installers
$ ./NVIDIA-Linux-x86_64-340.29.run --kernel-source-path=/usr/src/kernels/linux/

Installer Questions

  • Install NVidia’s 32-bit compatibility libraries? YES
  • Would you like to run nvidia-xconfig? NO

If everything worked, you should see:

nvidia drivers installed

Load nvidia kernel module

$ modprobe nvidia

No errors should be returned. Verify it’s loaded by running:

$ lsmod | grep -i nvidia

and you should see:

nvidia              10533711  0
i2c_core               41189  2 nvidia,i2c_piix4

Install CUDA

$ ./cuda-linux64-rel-6.5.14-18749181.run
$ ./cuda-samples-linux-6.5.14-18745345.run

Verify CUDA

$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery   

You should see the following output:

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GRID K520
Result = PASS

Congratulations! You now have a docker container running under CoreOS that can access the GPU.

Appendix A: Using CoreOS Alpha

The instructions above were for an older version of CoreOS. The following instructions are for CoreOS Alpha, and may also work on the current version of CoreOS stable (444.5.0). Only the steps that differ from those above are listed:

Launch CoreOS Alpha on an AWS GPU instance

  • Under “Community AMIs”, search for ami-66e6680e (CoreOS-alpha-490.0.0-hvm)

Run Ubuntu 14 docker container in privileged mode

$ sudo docker run --privileged=true -i -t ubuntu:14.04 /bin/bash

After the above command, you should be inside a root shell in your docker container. The rest of the steps will assume this.

Install build tools + other required packages

Install gcc 4.7 to match the version of gcc that was used to build the CoreOS kernel:

$ apt-get update
$ apt-get install gcc-4.7 g++-4.7 wget git make dpkg-dev

Set gcc 4.7 as default

$ update-alternatives --remove gcc /usr/bin/gcc-4.8
$ update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.7 60 --slave /usr/bin/g++ g++ /usr/bin/g++-4.7
$ update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 40 --slave /usr/bin/g++ g++ /usr/bin/g++-4.8

Verify

$ update-alternatives --config gcc

It should list gcc 4.7 with an asterisk next to it:

* 0            /usr/bin/gcc-4.7   60        auto mode

Prepare CoreOS kernel source

Get CoreOS kernel version

$ uname -a
Linux ip-10-11-167-200.ec2.internal 3.17.2+ #2 SMP Tue Nov 4 04:15:48 UTC 2014 x86_64 Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz GenuineIntel GNU/Linux

The CoreOS kernel version is 3.17.2

Switch correct branch for this kernel version

$ cd linux
$ git checkout remotes/origin/coreos/v3.17.2

Appendix B: Expose GPU to other docker containers

If you need other docker containers on this CoreOS instance to be able to access the GPU, you can do the following steps.

Note: you need to be using CoreOS-alpha-490.0.0 or later, since this requires Docker 1.3 to work.

Exit docker container

$ exit

You should be back to your CoreOS shell.

Add nvidia device nodes

$ wget https://gist.githubusercontent.com/tleyden/74f593a0beea300de08c/raw/95ed93c5751a989e58153db6f88c35515b7af120/nvidia_devices.sh
$ chmod +x nvidia_devices.sh
$ sudo ./nvidia_devices.sh

Verify device nodes

$ ls -alh /dev | grep -i nvidia
crw-rw-rw-  1 root root  251,   0 Nov  5 16:37 nvidia-uvm
crw-rw-rw-  1 root root  195,   0 Nov  5 16:37 nvidia0
crw-rw-rw-  1 root root  195, 255 Nov  5 16:37 nvidiactl

Launch docker containers

When you launch other docker containers on the same CoreOS instance, to allow them to access the GPU device you will need to add the following arguments:

$ sudo docker run -ti --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm tleyden5iwx/ubuntu-cuda /bin/bash

A complete example is available in Docker on AWS GPU Ubuntu 14.04 / CUDA 6.5. You can pick up at the Run GPU enabled docker image step.


Running Couchbase Cluster Under CoreOS on AWS

Here are instructions on how to fire up a Couchbase Server 2.2 cluster running under CoreOS on AWS CloudFormation. You will end up with the following system:

architecture diagram

Launch CoreOS instances via AWS Cloud Formation

Click the “Launch Stack” button to launch your CoreOS instances via AWS Cloud Formation:

NOTE: this is hardcoded to use the us-east-1 region, so if you need a different region, you should edit the URL accordingly

Use the following parameters in the form:

  • ClusterSize: 3 nodes (default)
  • Discovery URL: as it says, you need to grab a new token from https://discovery.etcd.io/new and paste it in the box.
  • KeyPair: use whatever you normally use to start EC2 instances. For this discussion, let’s assume you used aws, which corresponds to a file you have on your laptop called aws.cer

ssh into a CoreOS instance

Go to the AWS console under EC2 instances and find the public ip of one of your newly launched CoreOS instances.

Choose any one of them (it doesn’t matter which), and ssh into it as the core user with the cert provided in the previous step:

$ ssh -i aws.cer -A core@ec2-54-83-80-161.compute-1.amazonaws.com

Sanity check

Let’s make sure the CoreOS cluster is healthy first:

$ fleetctl list-machines

This should return a list of machines in the cluster, like this:

MACHINE          IP              METADATA
03b08680...     10.33.185.16    -
209a8a2e...     10.164.175.9    -
25dd84b7...     10.13.180.194   -

Download cluster-init script

$ wget https://raw.githubusercontent.com/tleyden/couchbase-server-coreos/master/2.2/scripts/cluster-init.sh
$ chmod +x cluster-init.sh

This script doesn’t do much. I wrapped things up in a script because the instructions were getting long, but all it does is:

  • Downloads a few fleet init files from github.
  • Generates a few more fleet init files based on a template and the number of nodes you want.
  • Stashes the username/password argument you give it into etcd.
  • Tells fleetctl to kick everything off. Whee!

Launch cluster

Run the script you downloaded in the previous step:

$ ./cluster-init.sh -n 3 -u "user:passw0rd"

Where:

  • -n the total number of Couchbase nodes to start; this should correspond to the number of EC2 instances (e.g., 3)
  • -u the username and password as a single string, delimited by a colon (:)

Replace user:passw0rd with a sensible username and password. It must be colon separated, with no spaces. The password itself must be at least 6 characters.

Once this command completes, your cluster will be in the process of launching.

Verify

To check the status of your cluster, run:

$ fleetctl list-units

You should see four units, all as active.

UNIT                     MACHINE             ACTIVE  SUB
couchbase_bootstrap_node.service                375d98b9.../10.63.168.35  active  running
couchbase_bootstrap_node_announce.service       375d98b9.../10.63.168.35  active  running
couchbase_node.1.service                        8cf54d4d.../10.187.61.136 active  running
couchbase_node.2.service                        b8cf0ed6.../10.179.161.76 active  running

Rebalance Couchbase Cluster

Login to Couchbase Server Web Admin

  • Find the public ip of any of your CoreOS instances via the AWS console
  • In a browser, go to http://<instance_public_ip>:8091
  • Login with the username/password you provided above

After logging in, your Server Nodes tab should look like this:

screenshot

Kick off initial rebalance

  • Click server nodes
  • Click “Rebalance”

After the rebalance is complete, you should see:

screenshot

Congratulations! You now have a 3 node Couchbase Server cluster running under CoreOS / Docker.


Goroutines vs Threads

Here are some of the advantages of Goroutines over threads:

  • You can run more goroutines on a typical system than you can threads.
  • Goroutines have growable segmented stacks.
  • Goroutines have a faster startup time than threads.
  • Goroutines come with built-in primitives to communicate safely between themselves (channels).
  • Goroutines allow you to avoid having to resort to mutex locking when sharing data structures.
  • Goroutines are multiplexed onto a small number of OS threads, rather than a 1:1 mapping.
  • You can write massively concurrent servers without having to resort to evented programming.

You can run more of them

With Java you can run thousands, or tens of thousands, of threads. With Go you can run hundreds of thousands, or even millions, of goroutines.

Java threads map directly to OS threads and are relatively heavyweight, partly because of their rather large fixed stack size. That fixed stack caps the number of threads you can run in a single VM, since the memory overhead grows with every thread.

Go, on the other hand, has a segmented stack that grows as needed. Goroutines are “green threads”, which means the Go runtime does the scheduling, not the OS. The runtime multiplexes the goroutines onto real OS threads, the number of which is controlled by GOMAXPROCS. Typically you’ll want to set this to the number of cores on your system, to maximize potential parallelism.

They let you avoid locking hell

One of the biggest drawbacks of threaded programming is the complexity and brittleness of many codebases that use threads to achieve high concurrency. There can be latent deadlocks and race conditions, and it can become nearly impossible to reason about the code.

Go, on the other hand, gives you primitives that allow you to avoid locking completely. The mantra is don’t communicate by sharing memory; share memory by communicating. In other words, if two goroutines need to share data, they can do so safely over a channel. Go handles all of the synchronization for you, and it’s much harder to run into things like deadlocks.

No callback spaghetti, either

There are other approaches to achieving high concurrency with a small number of threads. Python’s Twisted was one of the early ones that got a lot of attention. Node.js is currently the most prominent evented framework out there.

The problem with these evented frameworks is that the code complexity is also high, and difficult to reason about. Rather than “straight-line” coding, the programmer is forced to chain callbacks, which get interleaved with error handling. While refactoring can help tame some of the mental load, it’s still an issue.

Running Caffe on AWS GPU Instance via Docker

This is a tutorial to help you get the Caffe deep learning framework up and running on a GPU-powered AWS instance running inside a Docker container.

Architecture

architecture diagram

Setup host

Before you can start your docker container, you will need to go deeper down the rabbit hole.

You’ll first need to complete the steps here:

Setting up an Ubuntu 14.04 box running on a GPU-enabled AWS instance

After you’re done, you’ll end up with a host OS with the following properties:

  • A GPU enabled AWS instance running Ubuntu 14.04
  • Nvidia kernel module
  • Nvidia device drivers
  • CUDA 6.5 installed and verified

Install Docker

Once your host OS is set up, you’re ready to install docker (version 1.3 at the time of this writing).

Setup the key for the docker repo:

$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9

Add the docker repo:

$ sudo sh -c "echo deb https://get.docker.com/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
$ sudo apt-get update

Install docker:

$ sudo apt-get install lxc-docker

Run the docker container

Find your nvidia devices

$ ls -la /dev | grep nvidia

You should see:

crw-rw-rw-  1 root root    195,   0 Oct 25 19:37 nvidia0
crw-rw-rw-  1 root root    195, 255 Oct 25 19:37 nvidiactl
crw-rw-rw-  1 root root    251,   0 Oct 25 19:37 nvidia-uvm

You’ll have to adapt the DOCKER_NVIDIA_DEVICES variable below to match your particular devices.

Here’s how to start the docker container:

$ DOCKER_NVIDIA_DEVICES="--device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm"
$ sudo docker run -ti $DOCKER_NVIDIA_DEVICES tleyden5iwx/caffe-gpu /bin/bash

It’s a large docker image, so this might take a few minutes, depending on your network connection.

Run caffe test suite

After the above docker run command completes, your shell will now be inside a docker container that has Caffe installed.

You’ll want to run the Caffe test suite and make sure it passes. This will validate your environment, including your GPU drivers.

$ cd /opt/caffe
$ make test && make runtest

Expected Result: ... [ PASSED ] 838 tests.

Run the MNIST LeNet example

A more comprehensive way to verify your environment is to train the MNIST LeNet example:

$ cd /opt/caffe/data/mnist
$ ./get_mnist.sh
$ cd /opt/caffe
$ ./examples/mnist/create_mnist.sh
$ ./examples/mnist/train_lenet.sh

This will take a few minutes.

Expected output:

libdc1394 error: Failed to initialize libdc1394 
I1018 17:02:23.552733    66 caffe.cpp:90] Starting Optimization 
I1018 17:02:23.553583    66 solver.cpp:32] Initializing solver from parameters:
... lots of output ...
I1018 17:17:58.684598    66 caffe.cpp:102] Optimization Done.

Congratulations, you’ve got GPU-powered Caffe running in a docker container — celebrate with a cup of Philz!


Docker on AWS GPU Ubuntu 14.04 / CUDA 6.5

Architecture

After going through the steps in this blog post, you’ll end up with this:

architecture diagram

Setup host

Before you can start your docker container, you will need to go deeper down the rabbit hole.

You’ll first need to complete the steps here:

Setting up an Ubuntu 14.04 box running on a GPU-enabled AWS instance

After you’re done, you’ll end up with a host OS with the following properties:

  • A GPU enabled AWS instance running Ubuntu 14.04
  • Nvidia kernel module
  • Nvidia device drivers
  • CUDA 6.5 installed and verified

Install Docker

Once your host OS is set up, you’re ready to install docker (version 1.3 at the time of this writing).

Setup the key for the docker repo:

$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9

Add the docker repo:

$ sudo sh -c "echo deb https://get.docker.com/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
$ sudo apt-get update

Install docker:

$ sudo apt-get install lxc-docker

Run GPU enabled docker image

Find all your nvidia devices

$ ls -la /dev | grep nvidia

You should see:

crw-rw-rw-  1 root root    195,   0 Oct 25 19:37 nvidia0
crw-rw-rw-  1 root root    195, 255 Oct 25 19:37 nvidiactl
crw-rw-rw-  1 root root    251,   0 Oct 25 19:37 nvidia-uvm

Launch docker container

The easiest way to get going is to use this pre-built docker image that has the cuda drivers pre-installed. Or if you want to build your own, the accompanying dockerfile will be a useful starting point.

You’ll have to adapt the DOCKER_NVIDIA_DEVICES variable below to match your particular devices.

To start the docker container, run:

$ DOCKER_NVIDIA_DEVICES="--device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm"
$ sudo docker run -ti $DOCKER_NVIDIA_DEVICES tleyden5iwx/ubuntu-cuda /bin/bash

After running the above command, you should be at a shell inside your docker container:

root@1149788c731c:# 

Verify CUDA access from inside the docker container

Install CUDA samples

$ cd /opt/nvidia_installers
$ ./cuda-samples-linux-6.5.14-18745345.run -noprompt -cudaprefix=/usr/local/cuda-6.5/

Build deviceQuery sample

$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery   

You should see the following output

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GRID K520
Result = PASS


CUDA 6.5 on AWS GPU Instance Running Ubuntu 14.04

Using a pre-built public AMI

Based on the instructions in this blog post, I’ve created an AMI and shared it publicly. So the easiest thing to do is just use that pre-built AMI:

  • Image: ami-2cbf3e44 (Ubuntu Server 14.04 LTS (HVM) – CUDA 6.5)
  • Instance type: g2.2xlarge
  • Storage: Use at least 8 GB, 20+ GB recommended

If you use the pre-built AMI, then you can skip the rest of this article, since all of these steps are “baked in” to the AMI.

Building from scratch

Or if you prefer to build your own instance from scratch, keep reading.

Create a new EC2 instance:

  • Image: ami-9eaa1cf6 (Ubuntu Server 14.04 LTS (HVM), SSD Volume Type)
  • Instance type: g2.2xlarge
  • Storage: Use at least 8 GB, 20+ GB recommended

Install build-essential:

$ apt-get update && apt-get install build-essential

Get CUDA installer:

$ wget http://developer.download.nvidia.com/compute/cuda/6_5/rel/installers/cuda_6.5.14_linux_64.run

Extract CUDA installer:

$ chmod +x cuda_6.5.14_linux_64.run
$ mkdir nvidia_installers
$ ./cuda_6.5.14_linux_64.run -extract=`pwd`/nvidia_installers

Run Nvidia driver installer:

$ cd nvidia_installers
$ ./NVIDIA-Linux-x86_64-340.29.run

At this point it will pop up an 8-bit UI that will ask you to accept a license agreement, and then start installing.

screenshot

At this point, I got an error:

Unable to load the kernel module 'nvidia.ko'.  This happens most frequently when this kernel module was built against the wrong or
         improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if a driver
         such as rivafb, nvidiafb, or nouveau is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA graphics
         device(s), or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver release.

         Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log'
         for more information.

After reading this forum post I installed:

$ sudo apt-get install linux-image-extra-virtual

When it prompted me what to do about the grub changes, I chose “choose package maintainers version”.

Reboot:

$ reboot

Disable nouveau

At this point you need to disable nouveau, since it conflicts with the nvidia kernel module.

Open a new file

$ vi /etc/modprobe.d/blacklist-nouveau.conf

and add these lines to it

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

and then save the file.

Disable the Kernel Nouveau:

$ echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf

Reboot:

$ update-initramfs -u
$ reboot

One more try — this time it works

Get Kernel source:

$ apt-get install linux-source
$ apt-get install linux-headers-3.13.0-37-generic

Rerun Nvidia driver installer:

$ cd nvidia_installers
$ ./NVIDIA-Linux-x86_64-340.29.run

Load nvidia kernel module:

$ modprobe nvidia

Run CUDA + samples installer:

$ ./cuda-linux64-rel-6.5.14-18749181.run
$ ./cuda-samples-linux-6.5.14-18745345.run

Verify CUDA is correctly installed

$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery   

You should see the following output:

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GRID K520
Result = PASS


Debugging Into Android Source

Debugging into the core Android source code can be useful. Here’s how to do it in Android Studio 0.8.2.

Starting out, if we hit a breakpoint where we have a sqlite database object:

screenshot

And if you step in, you get this, which isn’t very useful:

screenshot

To fix that, go to the Android SDK Manager, find the API level you are using, and check the Sources for Android SDK box.

screenshot

You must restart Android Studio at this point

Did you restart Android Studio? Now re-run your app in the debugger, and when you try to step into the database.execSQL() method, you should see this:

screenshot

It worked! Now you can debug into any Android code.

Running Couchbase Sync Gateway on Google Compute Engine

First, a quick refresher on what Couchbase Sync Gateway actually is.

So here’s a birds-eye-view of the Couchbase Mobile architecture:

diagram

Sync Gateway allows Couchbase Lite mobile apps to sync data between each other and the Couchbase Server running on the backend.

This blog post will walk you through how to run Sync Gateway in a Docker container on Google Compute Engine.

Create GCE instance and ssh in

Follow the instructions on Running Docker on Google Compute Engine.

At this point you should be ssh’d into your GCE instance.

Create a configuration JSON

Here’s an example JSON configuration for Sync Gateway which uses walrus as its backing store, rather than Couchbase Server. Later we will swap in Couchbase Server as the backing store.
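For reference, a walrus-backed Sync Gateway 1.0 config looks roughly like this. This is a hand-written sketch reconstructed from the startup log output shown further down (bucket "sync_gateway", server walrus:/opt/sync_gateway/data); treat the linked gist as authoritative:

```json
{
  "log": ["REST"],
  "databases": {
    "sync_gateway": {
      "server": "walrus:/opt/sync_gateway/data",
      "bucket": "sync_gateway"
    }
  }
}
```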

Run Sync Gateway docker container

gce:~$ sudo docker run -d --name sg -p 4984:4984 -p 4985:4985 tleyden5iwx/couchbase-sync-gateway sync_gateway "https://gist.githubusercontent.com/tleyden/d97d985eb1e0725e858e/raw"

This will return a container id, e.g. 8ffb83fd1f.

Check the logs to make sure there are no serious errors:

gce:~$ sudo docker logs 8ffb83fd1f

You should see something along the lines of:

02:23:58.905587 Enabling logging: [REST]
02:23:58.905818 ==== Couchbase Sync Gateway/1.00 (unofficial) ====
02:23:58.905863 Opening db /sync_gateway as bucket "sync_gateway", pool "default", server <walrus:/opt/sync_gateway/data>
02:23:58.905964 Opening Walrus database sync_gateway on <walrus:/opt/sync_gateway/data>
02:23:58.909659 Starting admin server on :4985
02:23:58.913260 Starting server on :4984 ...

Expose API port 4984 via Firewall rule

On your workstation with the gcloud tool installed, run:

$ gcloud compute firewalls create sg-4984 --allow tcp:4984

Verify that it’s running

Find out external ip address of instance

On your workstation with the gcloud tool installed, run:

$ gcloud compute instances list
name     status  zone          machineType internalIP   externalIP
couchbse RUNNING us-central1-a f1-micro    10.240.74.44 142.222.178.49

Your external ip is listed under the externalIP column, eg 142.222.178.49 in this example.

Run curl request

On your workstation, replace the ip below with your own ip, and run:

$ curl http://142.222.178.49:4984

You should get a response like:

{"couchdb":"Welcome","vendor":{"name":"Couchbase Sync Gateway","version":1},"version":"Couchbase Sync Gateway/1.00 (unofficial)"}

Re-run it with Couchbase Server backing store

OK, so we’ve gotten it working with walrus. But have you looked at the walrus website lately? One click and it’s pretty obvious that this thing is not exactly meant to be a scalable production ready backend, nor has it ever claimed to be.

Let’s dump walrus for now and use Couchbase Server from this point onwards.

Start Couchbase Server

Before moving on, you will need to go through the instructions in Running Couchbase Server on GCE in order to get a Couchbase Server instance running.

Stop Sync Gateway

Run this command to stop the Sync Gateway container and completely remove it, using the same container id you used earlier:

gce:~$ sudo docker stop 8ffb83fd1f && sudo docker rm 8ffb83fd1f

Update config

Copy this example JSON configuration, which expects a Couchbase Server running on http://172.17.0.2:8091, and update it with the ip address of the docker instance where your Couchbase Server is running. To get this ip address, follow these instructions in the “Find the Docker instance IP address” section.

Now upload your modified JSON configuration to a website that is publicly accessible, for example in a Github Gist.

Run Sync Gateway

Run Sync Gateway again, this time using Couchbase Server as the backing store.

Replace http://yourserver.co/yourconfig.json with the URL where you’ve uploaded your JSON configuration from the previous step.

gce:~$ sudo docker run -d --name sg -p 4984:4984 -p 4985:4985 tleyden5iwx/couchbase-sync-gateway sync_gateway "http://yourserver.co/yourconfig.json"

This will return a container id, e.g. 9ffb83fd1f. Again, check the logs to make sure there are no serious errors:

gce:~$ sudo docker logs 9ffb83fd1f

You should see something along the lines of:

... 
02:23:58.913260 Starting server on :4984 ...

with no errors.

Verify it’s working

Save a document via curl

The easiest way to add a document is via the Admin port, since there is no authentication to worry about. Since we haven’t added a firewall rule to expose the admin port (4985), (and doing so without tight filtering would be a major security hole), the following command to create a new document must be run on the GCE instance.

gce:~$ curl -H "Content-Type: application/json" -d '{"such":"json"}' http://localhost:4985/sync_gateway/

If it worked, you should see a response like:

{"id":"3cbfbe43e76b7eb5c4c221a78b2cf0cc","ok":true,"rev":"1-cd809becc169215072fd567eebd8b8de"}

View document on Couchbase Server

To verify the document was successfully stored on Couchbase Server, you’ll need to login to the Couchbase Server Web Admin UI. There are instructions here on how to do that.

From there, navigate to Data Buckets / default / Documents, and you should see:

screenshot

Click on the document that has a UUID (eg, “29f8d7..” in the screenshot above), and you should see the document’s contents:

screenshot

The _sync metadata field is used internally by the Sync Gateway and can be ignored. The actual doc contents are towards the end of the file: .."such":"json"}