It’s the end of April and the pollen season is in full bloom here in Chicago. I am on the verge of scratching my eyes out.

It could be worse. I could be living in Atlanta, where everything turns yellow.

Anyway, while learning OpenStack, I was about to upload additional images into my lab. While I could build my own images, the nice thing about OpenStack is that I can upload AMI images. I went to the following page to retrieve additional images:

https://github.com/eucalyptus/eucalyptus/wiki/Starter-Images

There are other free AMI images out there, but I chose these because I can use Ubuntu’s built-in tool to upload the images:


openstack@openstackstorage:~/openstack/images$ cloud-publish-tarball opensuse-12.2-x86_64-emi.tar.gz images x86_64
Mon Apr 29 17:06:21 CDT 2013: ====== extracting image ======
kernel : kvm-kernel/vmlinuz-3.4.11-2.16-default
ramdisk: kvm-kernel/initrd-3.4.11-2.16-default
image : opensuse-12.2-x86_64-emi.img
Mon Apr 29 17:07:12 CDT 2013: ====== bundle/upload kernel ======
Mon Apr 29 17:07:19 CDT 2013: ====== bundle/upload ramdisk ======
Mon Apr 29 17:07:23 CDT 2013: ====== bundle/upload image ======
Mon Apr 29 17:13:11 CDT 2013: ====== done ======
emi="ami-0000000b"; eri="ari-0000000a"; eki="aki-00000009";

Of course, I could always use euca2ools to upload individual images (in fact, that is what cloud-publish-tarball is: a wrapper around some of the euca2ools commands). However, the nice thing about the cloud-publish-tarball tool is that it takes care of uploading the images as well as the associated manifests.
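The last line of that output is the part worth keeping, since it carries the IDs of the registered image, ramdisk and kernel. Here is a minimal sketch of pulling the image ID out of a saved copy of the output, assuming the final line keeps the emi="..." form seen in the run above. The function name extract_emi is made up for illustration and is not part of cloud-utils.

```shell
# Hypothetical helper: read cloud-publish-tarball output on stdin and
# print the value inside emi="..." if such a line is present.
extract_emi() {
    sed -n 's/.*emi="\([^"]*\)".*/\1/p'
}

# Sample line copied from the run above.
sample='emi="ami-0000000b"; eri="ari-0000000a"; eki="aki-00000009";'
printf '%s\n' "$sample" | extract_emi
```

Parsing with sed rather than running eval on the line keeps the tool’s output from being executed as shell code.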

In any event, once the images are uploaded, you will see them listed by the nova image-list command:


stardust:openstack rilindo$ nova image-list
+--------------------------------------+-----------------------------------------------------+--------+--------+
| ID                                   | Name                                                | Status | Server |
+--------------------------------------+-----------------------------------------------------+--------+--------+
| 36acf23a-07e4-4253-8183-da60253d919a | images/centos-6.3-x86_64.img                        | ACTIVE |        |
| 1bba647b-111b-4ea7-a614-76229fd63c8c | images/initrd-2.6.32-279.14.1.el6.x86_64.img        | ACTIVE |        |
| 054b1f8b-1051-4fdd-b8a1-0efb442ab127 | images/initrd-3.4.11-2.16-default                   | ACTIVE |        |
| 756ee7b6-831b-4e30-899c-4b6aa2f3fafd | images/opensuse-12.2-x86_64-emi.img                 | ACTIVE |        |
| 6e63d0cd-4e24-4766-ad48-3a01670a607e | images/precise-server-cloudimg-i386-vmlinuz-virtual | ACTIVE |        |
| bedf0e78-c7d4-414e-85fb-291a0ccd851d | images/precise-server-cloudimg-i386.img             | ACTIVE |        |
| 490a92a6-5741-4485-8465-df9fc2c19a5c | images/vmlinuz-2.6.32-279.14.1.el6.x86_64           | ACTIVE |        |
| b9886335-e04a-4086-a860-852240430d53 | images/vmlinuz-3.4.11-2.16-default                  | ACTIVE |        |
+--------------------------------------+-----------------------------------------------------+--------+--------+

The first column (ID) is the actual file name of each image on disk. The files are uploaded to the following directory:


root@openstack1:/var/lib/glance/images# ls -la
total 6125848
drwxr-xr-x 2 glance glance       4096 Apr 29 16:40 .
drwxr-xr-x 4 glance glance       4096 Apr 29 16:42 ..
-rw-rw-r-- 1 glance glance    5943048 Apr 29 16:18 1bba647b-111b-4ea7-a614-76229fd63c8c
-rw-rw-r-- 1 glance glance 4781506560 Apr 29 16:42 36acf23a-07e4-4253-8183-da60253d919a
-rw-rw-r-- 1 glance glance    3988752 Apr 29 16:18 490a92a6-5741-4485-8465-df9fc2c19a5c
-rw-rw-r-- 1 glance glance    5017344 Apr 22 18:24 6e63d0cd-4e24-4766-ad48-3a01670a607e
-rw-rw-r-- 1 glance glance 1476395008 Apr 22 18:26 bedf0e78-c7d4-414e-85fb-291a0ccd851d

They are then registered by glance-registry.

Important point: When you upload an image, CPU utilization for nova-api will temporarily skyrocket. On a server-class system, the spike would probably be fairly brief. On my desktop “server”, it took about 10-15 minutes:


top - 16:34:51 up 3 days, 25 min,  1 user,  load average: 1.83, 1.30, 0.85
Tasks: 116 total,   2 running, 114 sleeping,   0 stopped,   0 zombie
Cpu(s): 97.0%us,  3.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3791440k total,  3641552k used,   149888k free,   145980k buffers
Swap:  3928060k total,     3232k used,  3924828k free,  2797332k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                     
24048 nova      20   0  247m  86m 5764 R 95.2  2.3  10:09.51 nova-api                                                                                    
 1251 nova      20   0  205m  58m 4588 S  2.0  1.6  21:57.61 nova-network                                                                                
    3 root      20   0     0    0    0 S  0.3  0.0   2:06.05 ksoftirqd/0                                                                                 
   22 root      20   0     0    0    0 D  0.3  0.0   0:00.15 kswapd0                                                                                     
 1250 nova      20   0  197m  50m 4588 S  0.3  1.4  12:52.87 nova-scheduler                                                                              
 1252 nova      20   0  195m  49m 4588 S  0.3  1.3  12:08.12 nova-cert                                                                                   
 1522 mysql     20   0  870m  58m 7820 S  0.3  1.6  11:18.90 mysqld                                                                                      
 1757 rabbitmq  20   0  568m  29m 2284 S  0.3  0.8   5:44.25 beam                                                                                        
27080 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/0:1             

I couldn’t use the nova commands in the meantime, as they would hang and wait until nova-api finished. The first time it happened, I restarted nova-api, which killed the image registration, forcing me to delete and restart the image upload. :( But eventually it finished and, after some inspection, I was able to build my Fedora, CentOS 6 and openSUSE instances.


stardust:openstack rilindo$ ssh -i mykey.pem -lroot 192.168.15.13 uname -an
Linux vmi013.novalocal 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 23:43:09 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
stardust:openstack rilindo$ ssh -i mykey.pem -lroot 192.168.15.16 uname -an
Linux vmi012 3.4.11-2.16-default #1 SMP Wed Sep 26 17:05:00 UTC 2012 (259fc87) x86_64 x86_64 x86_64 GNU/Linux

Next: Mac OS X and maybe Keystone

Long time, no write. This will change immediately.

Recently, I have been getting familiar with OpenStack by way of the book OpenStack Cloud Computing Cookbook. For the most part, it was an easy read and I got my home OpenStack environment up and running, albeit with some issues.

The book starts by asking the reader to set up a couple of Ubuntu 12.04 machines using VirtualBox. I decided to be clever and set up KVM instances instead, with fairly mixed results. (At some point during the exploration, I decided to start over and went with some spare physical hardware. Just as well, since I needed to redo my private lab anyway.)

Next, I went and installed the following prerequisites per instruction on one single server:


sudo apt-get -y install rabbitmq-server nova-api nova-objectstore nova-scheduler nova-network nova-compute nova-cert glance qemu unzip

This installed OpenStack Essex, which is the default on Ubuntu 12.04.

Then I set up the preseed parameters for the MySQL server. Originally, it was MySQL 5.1:

 
cat <<MYSQL_PRESEED | debconf-set-selections
mysql-server-5.1 mysql-server/root_password password openstack
mysql-server-5.1 mysql-server/root_password_again password openstack
mysql-server-5.1 mysql-server/start_on_boot boolean true
MYSQL_PRESEED

However, Ubuntu 12.04.2 (which is what I am using) apparently installs 5.5 by default, so some quick changes were in order:


cat <<MYSQL_PRESEED | debconf-set-selections
mysql-server-5.5 mysql-server/root_password password openstack
mysql-server-5.5 mysql-server/root_password_again password openstack
mysql-server-5.5 mysql-server/start_on_boot boolean true
MYSQL_PRESEED

Then I installed MySQL and changed the default config so that it listens on all interfaces rather than just localhost:

 
sudo apt-get update
sudo apt-get -y install mysql-server
sudo sed -i 's/127.0.0.1/0.0.0.0/g' /etc/mysql/my.cnf

Then I created the nova database and user and set the user’s password:

 
MYSQL_PASS=openstack
mysql -uroot -p$MYSQL_PASS -e 'CREATE DATABASE nova;'
mysql -uroot -p$MYSQL_PASS -e "GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%'"
mysql -uroot -p$MYSQL_PASS -e "SET PASSWORD FOR 'nova'@'%' = PASSWORD('$MYSQL_PASS');"

Then I updated /etc/nova/nova.conf with the MySQL credentials:

 
--sql_connection=mysql://nova:openstack@192.168.15.105/nova

Here 192.168.15.105 is the IP of the Ubuntu box openstack1.

Then I added the next set of parameters:

 
--use_deprecated_auth
--s3_host=192.168.15.105
--rabbit_host=192.168.15.105
--ec2_host=192.168.15.105
--ec2_dmz_host=192.168.15.105
--public_interface=eth1
--image_service=nova.image.glance.GlanceImageService
--glance_api_servers=192.168.15.105:9292
--auto_assign_floating_ip=true
--scheduler_default_filters=AllHostsFilter

(Incidentally, I am not going into detail about the parameters. You can look them up at docs.openstack.org.)

Then I synced the database schema:

 
sudo nova-manage db sync

Then I set up the network ranges:

 
sudo nova-manage network create vmnet --fixed_range_v4=10.0.0.0/8 --network_size=64 --bridge_interface=eth0
sudo nova-manage floating create --ip_range=192.168.15.0/24

The first range is the fixed network, which the VMs use to talk to each other as well as to OpenStack. The second is the floating range, the public-facing IPs (or at least public from a user standpoint).

Then I stopped and started the following services:

nova-compute 

nova-network 

nova-api 

nova-scheduler 

nova-objectstore 

nova-cert

libvirt-bin 

glance-registry 

glance-api
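Stopping and starting nine services by hand gets old quickly. A small sketch of a loop that does it, assuming the Ubuntu service command and the service names listed above. The function name and the dry-run default are additions for illustration, not something from the book.

```shell
# Services in the order listed above.
SERVICES="nova-compute nova-network nova-api nova-scheduler nova-objectstore nova-cert libvirt-bin glance-registry glance-api"

# bounce_services [RUNNER]
# RUNNER defaults to "echo", which only prints the commands (dry run).
# Pass "sudo" to actually stop and then start each service.
bounce_services() {
    runner=${1:-echo}
    for svc in $SERVICES; do
        $runner service "$svc" stop
    done
    for svc in $SERVICES; do
        $runner service "$svc" start
    done
}

bounce_services          # dry run: prints the 18 commands
# bounce_services sudo   # uncomment to really bounce everything
```

Defaulting to a dry run makes it easy to check the order before anything is actually stopped.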

By way of brief explanation, nova-compute creates and destroys the instances, while nova-network assigns and creates the IPs and VLANs. nova-api provides access to the services for the various application components, and nova-scheduler decides which host runs the requests submitted by the various services.

I am not sure about nova-objectstore, but as far as I can tell nova-cert handles the certificates used to secure the connections between the various services. libvirt-bin, of course, is the wrapper around KVM and provides access to the KVM hypervisor. Finally, glance-registry and glance-api register and manage the images.

At this point, I went and created a user, gave it admin access, and then created a project called “cookbook”:

 
sudo nova-manage user admin openstack
sudo nova-manage role add openstack cloudadmin
sudo nova-manage project create cookbook openstack

Then I zipped up the credentials for the cookbook project:

 
sudo nova-manage project zipfile cookbook openstack

Then I installed the tools necessary to manage OpenStack on my Ubuntu client:


sudo apt-get install euca2ools python-novaclient unzip

(Later on, I used the Mac OS X version, which I will get to in another post.)

I copied the zip file from openstack1, unzipped it into a directory called openstack, changed into that directory, and then sourced the environment file to load the parameters into the shell:


. novarc

Next, I generated a key pair and registered it in the database:


nova keypair-add openstack > openstack.pem  

chmod 0600 *.pem

This installs an SSH key for the default user of whatever instance I create, so that I can simply run:


ssh -i openstack.pem username@instance_name

(The book had it as chmod 0600.pem, BTW, which is an obvious typo.)

Finally, I was ready to upload my image. So I downloaded a cloud version of Ubuntu’s server:


wget http://uec-images.ubuntu.com/releases/precise/release/ubuntu-12.04-server-cloudimg-i386.tar.gz

Installed the cloud-utils tools:


sudo apt-get install cloud-utils

And then I attempted to upload the image …


cloud-publish-tarball ubuntu-12.04-server-cloudimg-i386.tar.gz images i386

and I ran into problems: it kept running out of space. I thought the destination was the issue, so I rebuilt the server (at the time, it was a virtual server, not a physical one). Eventually, it turned out that it was running out of space at the source. I had installed the Ubuntu server with separate file systems for /, /usr, /var and /tmp. By default, the tool extracts to /tmp, which was too small. So I changed the default in the shell to:


TMPDIR=/var/tmp
TEMPDIR=/var/tmp
export TMPDIR ; export TEMPDIR
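Since the failure only shows up partway through the extraction, it can help to check the free space first. A rough sketch, assuming a POSIX df. The name has_space is made up, and the 2 GB threshold is a guess for illustration, not something cloud-publish-tarball documents.

```shell
# has_space DIR NEEDED_KB
# Succeeds if the filesystem holding DIR reports at least NEEDED_KB
# kilobytes available, according to df.
has_space() {
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$2" ]
}

# Fall back to /var/tmp when /tmp looks too small.
# 2097152 KB (2 GB) is an assumed figure; size it to your image.
if ! has_space /tmp 2097152; then
    TMPDIR=/var/tmp
    export TMPDIR
fi
```

The -P flag keeps each filesystem on a single line, so the second line of output is always the one to read.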

And afterwards, I was able to upload the image, which I was able to view with nova image-list:


+--------------------------------------+-----------------------------------------------------+--------+--------+
| ID                                   | Name                                                | Status | Server |
+--------------------------------------+-----------------------------------------------------+--------+--------+
| 6e63d0cd-4e24-4766-ad48-3a01670a607e | images/precise-server-cloudimg-i386-vmlinuz-virtual | ACTIVE |        |
| bedf0e78-c7d4-414e-85fb-291a0ccd851d | images/precise-server-cloudimg-i386.img             | ACTIVE |        |
+--------------------------------------+-----------------------------------------------------+--------+--------+

So I was ready to build an instance.

I added the appropriate access to the ports:


nova secgroup-add-rule default tcp 22 22 0.0.0.0/0 
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0

And then I went and “booted” (that is, created) my instance:


nova boot myInstance --image 0e2f43a8-e614-48ff-92bd-be0c68da19f4 --flavor 2 --key_name openstack

And… it didn’t quite work. I mean, I can see the console output with:


nova console-log myInstance

But no IP was assigned, so I couldn’t log in. After much research, I found that the firewall rules were preventing DHCP requests from going through, so I added:


iptables -A POSTROUTING -t mangle -p udp --dport 68 -j CHECKSUM --checksum-fill

And from that point on, DHCP requests were going through and I was able to log in.

At this point, I felt confident and decided to skip ahead and add another node to the group.

Big mistake. But I am getting ahead of myself.

I went to chapter 11 and per instruction, installed just the following:


sudo apt-get -y install nova-compute nova-network nova-api

I copied nova.conf to the new node, updated the IPs, and verified on the original node that the services were up:


root@openstack1:~# nova-manage service list
2013-04-27 22:38:46 DEBUG nova.utils [req-9dd11667-7967-4221-807f-98ddaf9371b3 None None] backend  from (pid=18080) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:662
Binary           Host                                 Zone             Status     State Updated_At
nova-cert        openstack1                           nova             enabled    :-)   2013-04-28 03:38:45
nova-scheduler   openstack1                           nova             enabled    :-)   2013-04-28 03:38:44
nova-network     openstack1                           nova             enabled    :-)   2013-04-28 03:38:45
nova-compute     openstack1                           nova             enabled    :-)   2013-04-28 03:38:38
nova-compute     openstack2                           nova             enabled    :-)   2013-04-28 03:38:39
nova-network     openstack2                           nova             enabled    :-)   2013-04-26 16:58:30

Feeling pretty sure that it would work as intended, I attempted to create more instances, and I failed. IPs failed to be assigned once again.

First of all, I didn’t set up my switch properly to handle the private network between the nodes. I forgot that my switch can only separate ports into separate broadcast domains using VLAN tagging (which OpenStack uses by default by way of VLAN Manager, which I didn’t pay much attention to; this will become significant shortly). After a while, I just gave up and plugged a crossover cable between openstack1 and openstack2 (later on, I did remember how to set up the switch properly and got the packets tagged appropriately).

So at that point, I was able to create the instances. But then I couldn’t log in using my private keys. Reviewing my console log, I found this (example pulled from Google):


'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [50/120s]:  url error [timed out]

When an instance boots, it pulls its credentials from this link-local metadata address (169.254.169.254), which in turn is supposed to be routed to the API on the controller (which is on openstack1). I tried to correct the problem by adding a route:


sudo ip route add 169.254.0.0/16 metric 1000 dev eth1

openstack@openstack2:~$ ip route show
default via 192.168.15.1 dev eth1  metric 100 
169.254.0.0/16 dev eth1  scope link  metric 1000 
192.168.15.0/24 dev eth1  proto kernel  scope link  src 192.168.15.110 
192.168.122.0/24 dev virbr0  proto kernel  scope link  src 192.168.122.1 

But the problem still persisted.

Finally, after several long looks at the following pages:

http://www.mirantis.com/blog/openstack-networking-single-host-flatdhcpmanager/

http://www.mirantis.com/blog/openstack-networking-flatmanager-and-flatdhcpmanager/

And after going back and reading chapter 10 in the book (the chapter I skipped), I uninstalled nova-network from the new node. And suddenly, the instances were able to reach the API.

Remember how the default OpenStack setup was using VLAN Manager? That is useful if you need to separate tenants using IP ranges and VLANs. More importantly, it meant that nova-network is only needed on the controller side (since the controller handles the routing and IP assignment). It is only when I use one of the other networking setups (Flat Networking or Flat Networking with DHCP, where tenants are isolated using security groups) that nova-network is necessary on the new nodes. In my setup, the firewall rules set up by nova-network on the new node were blocking the instances’ access to the API.

So that was resolved and I was able to build instances at will. Well, mostly. I tried to create a new instance just now and got this:


ERROR: Quota exceeded: code=InstanceLimitExceeded (HTTP 413) (Request-ID: req-82aa560f-0318-4e55-b5bd-98b10d1b9c60)

Heh. Removing one instance now:



stardust:openstack rilindo$ nova delete vmi001
stardust:openstack rilindo$ nova list
+--------------------------------------+--------+--------+--------------------------------+
| ID                                   | Name   | Status | Networks                       |
+--------------------------------------+--------+--------+--------------------------------+
| 7a6a4912-71ce-45e2-8a38-51537a4c7ffb | vmi001 | ACTIVE | vmnet=10.0.0.4, 192.168.15.12  |
| 2fbaad51-35f4-4ac4-b6a3-d338af5905d8 | vmi002 | ACTIVE | vmnet=10.0.0.7, 192.168.15.13  |
| c70c7522-c8fb-4196-8161-6c6153549729 | vmi003 | ACTIVE | vmnet=10.0.0.8, 192.168.15.16  |
| 9899b0ee-4cac-4588-891b-705a6cc95512 | vmi004 | ACTIVE | vmnet=10.0.0.9, 192.168.15.17  |
| 41db335c-97a2-4335-9e91-7a0a4345b70a | vmi005 | ACTIVE | vmnet=10.0.0.10, 192.168.15.18 |
| 4d2698e2-88ba-42f6-9107-0380d89c3e89 | vmi006 | ACTIVE | vmnet=10.0.0.11, 192.168.15.19 |
| 9cf60a41-1a58-4dbd-a47c-b4f2c42e6117 | vmi007 | ACTIVE | vmnet=10.0.0.12, 192.168.15.24 |
| de257ab1-cf03-4f08-816f-c9e16ce2793a | vmi008 | ACTIVE | vmnet=10.0.0.13, 192.168.15.25 |
| 0c67d3bc-1ca1-4cfe-9653-eb3a5c94350f | vmi009 | ACTIVE | vmnet=10.0.0.14, 192.168.15.26 |
| 9b5601bf-969f-447f-9e2a-8727dd3d45e2 | vmi010 | ACTIVE | vmnet=10.0.0.15, 192.168.15.27 |
+--------------------------------------+--------+--------+--------------------------------+

stardust:openstack rilindo$ nova list
+--------------------------------------+--------+--------+--------------------------------+
| ID                                   | Name   | Status | Networks                       |
+--------------------------------------+--------+--------+--------------------------------+
| 2fbaad51-35f4-4ac4-b6a3-d338af5905d8 | vmi002 | ACTIVE | vmnet=10.0.0.7, 192.168.15.13  |
| c70c7522-c8fb-4196-8161-6c6153549729 | vmi003 | ACTIVE | vmnet=10.0.0.8, 192.168.15.16  |
| 9899b0ee-4cac-4588-891b-705a6cc95512 | vmi004 | ACTIVE | vmnet=10.0.0.9, 192.168.15.17  |
| 41db335c-97a2-4335-9e91-7a0a4345b70a | vmi005 | ACTIVE | vmnet=10.0.0.10, 192.168.15.18 |
| 4d2698e2-88ba-42f6-9107-0380d89c3e89 | vmi006 | ACTIVE | vmnet=10.0.0.11, 192.168.15.19 |
| 9cf60a41-1a58-4dbd-a47c-b4f2c42e6117 | vmi007 | ACTIVE | vmnet=10.0.0.12, 192.168.15.24 |
| de257ab1-cf03-4f08-816f-c9e16ce2793a | vmi008 | ACTIVE | vmnet=10.0.0.13, 192.168.15.25 |
| 0c67d3bc-1ca1-4cfe-9653-eb3a5c94350f | vmi009 | ACTIVE | vmnet=10.0.0.14, 192.168.15.26 |
| 9b5601bf-969f-447f-9e2a-8727dd3d45e2 | vmi010 | ACTIVE | vmnet=10.0.0.15, 192.168.15.27 |
+--------------------------------------+--------+--------+--------------------------------+
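For a quick check against the quota before booting, the table can be counted rather than eyeballed. A sketch that counts the data rows of nova list output piped into it, assuming the table layout shown above. The name count_instances is hypothetical.

```shell
# count_instances: read a nova list table on stdin and print the number
# of instance rows (lines starting with "|" other than the ID header).
count_instances() {
    awk -F'|' '/^\|/ && $2 !~ /^ *ID *$/ { n++ } END { print n + 0 }'
}

# Against a live cloud (needs the novarc environment loaded):
#   nova list | count_instances
# Self-contained demonstration using two rows copied from the listing above:
count_instances <<'EOF'
| ID                                   | Name   | Status | Networks                       |
| 2fbaad51-35f4-4ac4-b6a3-d338af5905d8 | vmi002 | ACTIVE | vmnet=10.0.0.7, 192.168.15.13  |
| c70c7522-c8fb-4196-8161-6c6153549729 | vmi003 | ACTIVE | vmnet=10.0.0.8, 192.168.15.16  |
EOF
```

Border lines are skipped because they do not start with a pipe, so only real rows are counted.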

And adding a new one:

stardust:openstack rilindo$ nova boot vmi011 --image bedf0e78-c7d4-414e-85fb-291a0ccd851d --flavor 2 --key_name mykey
+-------------------------------------+----------------------------------------------------------+
| Property                            | Value                                                    |
+-------------------------------------+----------------------------------------------------------+
| status                              | BUILD                                                    |
| updated                             | 2013-04-28T04:02:23Z                                     |
| OS-EXT-STS:task_state               | scheduling                                               |
| OS-EXT-SRV-ATTR:host                | openstack2                                               |
| key_name                            | mykey                                                    |
| image                               | images/precise-server-cloudimg-i386.img                  |
| hostId                              | e1693fb6dfb89d758273a4312096678745f8f568dbdc3fbe279e286b |
| OS-EXT-STS:vm_state                 | building                                                 |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000046                                        |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                                     |
| flavor                              | m1.small                                                 |
| id                                  | 52cc6451-1c5c-462e-8d06-6f9465ce1a94                     |
| user_id                             | openstack                                                |
| name                                | vmi011                                                   |
| adminPass                           | RVeaJd7j7Kmu                                             |
| tenant_id                           | cookbook                                                 |
| created                             | 2013-04-28T04:02:22Z                                     |
| OS-DCF:diskConfig                   | MANUAL                                                   |
| accessIPv4                          |                                                          |
| accessIPv6                          |                                                          |
| progress                            | 0                                                        |
| OS-EXT-STS:power_state              | 0                                                        |
| metadata                            | {}                                                       |
| config_drive                        |                                                          |
+-------------------------------------+----------------------------------------------------------+

stardust:openstack rilindo$ nova list
+--------------------------------------+--------+--------+--------------------------------+
| ID                                   | Name   | Status | Networks                       |
+--------------------------------------+--------+--------+--------------------------------+
| 2fbaad51-35f4-4ac4-b6a3-d338af5905d8 | vmi002 | ACTIVE | vmnet=10.0.0.7, 192.168.15.13  |
| c70c7522-c8fb-4196-8161-6c6153549729 | vmi003 | ACTIVE | vmnet=10.0.0.8, 192.168.15.16  |
| 9899b0ee-4cac-4588-891b-705a6cc95512 | vmi004 | ACTIVE | vmnet=10.0.0.9, 192.168.15.17  |
| 41db335c-97a2-4335-9e91-7a0a4345b70a | vmi005 | ACTIVE | vmnet=10.0.0.10, 192.168.15.18 |
| 4d2698e2-88ba-42f6-9107-0380d89c3e89 | vmi006 | ACTIVE | vmnet=10.0.0.11, 192.168.15.19 |
| 9cf60a41-1a58-4dbd-a47c-b4f2c42e6117 | vmi007 | ACTIVE | vmnet=10.0.0.12, 192.168.15.24 |
| de257ab1-cf03-4f08-816f-c9e16ce2793a | vmi008 | ACTIVE | vmnet=10.0.0.13, 192.168.15.25 |
| 0c67d3bc-1ca1-4cfe-9653-eb3a5c94350f | vmi009 | ACTIVE | vmnet=10.0.0.14, 192.168.15.26 |
| 9b5601bf-969f-447f-9e2a-8727dd3d45e2 | vmi010 | ACTIVE | vmnet=10.0.0.15, 192.168.15.27 |
| 52cc6451-1c5c-462e-8d06-6f9465ce1a94 | vmi011 | ACTIVE | vmnet=10.0.0.4, 192.168.15.12  |
+--------------------------------------+--------+--------+--------------------------------+

stardust:openstack rilindo$ ssh -i mykey.pem ubuntu@192.168.15.12
The authenticity of host '192.168.15.12 (192.168.15.12)' can't be established.
RSA key fingerprint is 85:01:59:17:8d:11:4b:7c:60:72:c8:09:be:2d:45:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.15.12' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-40-virtual i686)

 * Documentation:  https://help.ubuntu.com/

  System information as of Sun Apr 28 04:17:07 UTC 2013

  System load:  0.0              Processes:           61
  Usage of /:   6.9% of 9.84GB   Users logged in:     0
  Memory usage: 1%               IP address for eth0: 10.0.0.4
  Swap usage:   0%

  Graph this data and manage this system at https://landscape.canonical.com/

  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud

  Use Juju to deploy your cloud instances and workloads:
    https://juju.ubuntu.com/#cloud-precise

0 packages can be updated.
0 updates are security updates.

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@vmi011:~$ 

URLs I used as reference:

http://www.mail-archive.com/openstack@lists.launchpad.net/msg20584.html

http://www.mirantis.com/blog/openstack-networking-single-host-flatdhcpmanager/

http://www.mirantis.com/blog/openstack-networking-flatmanager-and-flatdhcpmanager/

http://www.gossamer-threads.com/lists/openstack/dev/17681

http://howtobyexamples.blogspot.com/2011/10/openstack-novaconf-configuration.html

http://blog.tocisoft.com/2011/06/why-ec2-ami-tools-ec2-upload-bundle.html

http://docs.openstack.org/essex/openstack-compute/admin/content/configuring-networking-on-the-compute-node.html

https://answers.launchpad.net/nova/+question/215096

http://docs.openstack.org/essex/openstack-compute/admin/content/network-troubleshooting.html

http://docs.openstack.org/folsom/openstack-network/admin/content/adv_cfg_l3_agent_metadata.html

Note: The book was written for OpenStack Essex. Since then, Folsom and now Grizzly have been released, which means that many of the search results I found were for the later releases. That made things more difficult for me, as I wasn’t sure whether a given solution was applicable to Essex or not. (For example, nova-network was replaced by Quantum, which apparently does networking a bit differently.)

I hate having to manually set the hostname in a kickstart file, so when I found a fix, I was very happy. I wish I could take credit, but it was originally devised by somebody who was trying to figure out a way to automatically set the hostname for VMware ESX machines. Unfortunately, I lost that link, so I can’t refer to the original page for credit. The best I can do is explain how it is done and hope I find that link later and update this post, so that the right person is properly attributed.

To explain how the solution works, it’s good to understand how Linux boots a system, which this article does a very good job of explaining. However, if you are impatient, this is the short version:

  1. Computer turns on (DUH!)
  2. BIOS kicks in, performs POST, enumerates and initializes local devices, and then searches for active and bootable devices.
  3. Stage 1 (MBR) kicks in and looks for the boot loader (in our case, GRUB).
  4. GRUB (Stage 2) then loads the kernel with an optional ramdisk.
  5. The kernel boots, initializes and then starts init (or some other process), which then starts up other processes.

Now with that in mind, let’s take a look at our GRUB configuration on jenkins:

[root@jenkins chef]# cat /etc/grub.conf 
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_centos6-lv_root
#          initrd /initrd-[generic-]version.img
#boot=/dev/vda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-220.2.1.el6.x86_64)
	root (hd0,0)
	kernel /vmlinuz-2.6.32-220.2.1.el6.x86_64 ro root=/dev/mapper/vg_centos6-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_centos6/lv_swap rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb rd_LVM_LV=vg_centos6/lv_root  KEYBOARDTYPE=pc KEYTABLE=us crashkernel=auto rhgb quiet rd_NO_DM
	initrd /initramfs-2.6.32-220.2.1.el6.x86_64.img

As you can see, it boots the kernel and sets parameters such as the root file system, language, keyboard and other things needed for the system to boot up properly. That information is still available from the running kernel by viewing the following file:

[root@jenkins chef]# cat /proc/cmdline 
ro root=/dev/mapper/vg_centos6-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_centos6/lv_swap rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb rd_LVM_LV=vg_centos6/lv_root  KEYBOARDTYPE=pc KEYTABLE=us  rhgb quiet rd_NO_DM
[root@jenkins chef]

Notice that in this file, you will find the same parameters as in grub.conf. In some ways, if init (at least on System V systems) is the mother of all processes, the kernel is the grandmother, quietly hidden in the background.

What if you were to pass a parameter that the kernel doesn’t recognize? In most cases, it will probably ignore it, but the parameter will still appear on the kernel command line. So let’s insert:

FOO=BAR

into the kernel line right between “crashkernel=auto” and “rhgb” (either in grub.conf or by editing the kernel line at the boot loader menu during stage 2):

kernel /vmlinuz-2.6.32-220.2.1.el6.x86_64 ro root=/dev/mapper/vg_centos6-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_centos6/lv_swap rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb rd_LVM_LV=vg_centos6/lv_root  KEYBOARDTYPE=pc KEYTABLE=us crashkernel=auto FOO=BAR rhgb quiet rd_NO_DM

Now let’s view /proc/cmdline again:

[root@jenkins ~]# cat /proc/cmdline 
ro root=/dev/mapper/vg_centos6-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_centos6/lv_swap rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb rd_LVM_LV=vg_centos6/lv_root  KEYBOARDTYPE=pc KEYTABLE=us  FOO=BAR rhgb quiet rd_NO_DM
[root@jenkins ~]# 

As we can see, FOO=BAR is in there, with no ill effects to the system boot.

So why would we want to pass a value that the kernel doesn’t use? So that we can do this:

[rilindo@jenkins ~]$ for x in `cat /proc/cmdline`
> do
> case $x in FOO*)
> eval $x
> echo "${FOO}" 
> ;;
> esac
> done
BAR
[rilindo@jenkins ~]$ 

What this script does is take the contents of /proc/cmdline as a series of whitespace-separated words (think of it like a list or an array) and loop through it. We test each word with a case statement and, if it matches (in this case, anything starting with FOO), we eval it, which turns FOO=BAR into a shell variable assignment. We then echo that variable, which returns the value. In other words, we look for a word that starts with “FOO” and get “BAR” out of it.
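One caveat with the loop above: eval runs whatever text follows the match, so a malformed boot parameter becomes shell code. Here is a minimal variation that avoids eval by slicing the value out with parameter expansion. The name cmdline_value is a hypothetical helper, and it takes the command line as an argument so it can be tried without rebooting.

```shell
# cmdline_value KEY "COMMAND LINE"
# Print the value of the first KEY=value word and return 0,
# or return 1 if KEY is not present.
cmdline_value() {
    key=$1
    for word in $2; do            # unquoted on purpose: split on whitespace
        case $word in
            "$key"=*)
                # Strip everything up to and including the first "=".
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

cmdline_value FOO "ro quiet FOO=BAR rhgb"    # prints BAR
# On a running system: cmdline_value FOO "$(cat /proc/cmdline)"
```

Matching on "$key"= rather than a bare prefix also keeps a parameter such as FOOBAR=1 from being picked up by accident.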

That is essentially how we automatically set the hostname in our installation. Using this technique, we put this script in our %pre section of our kickstart: 

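%pre
#!/bin/sh
# Walk the kernel command line one word at a time.
for x in `cat /proc/cmdline`; do
        case $x in
        SERVERNAME*)
                # Turn SERVERNAME=name into a shell variable.
                eval $x
                # Write the network line that the main section includes.
                echo "network --device eth0 --bootproto dhcp --hostname ${SERVERNAME}.monzell.com" > /tmp/network.ks
                ;;
        esac
done
%end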

Here, we look for a value called SERVERNAME and evaluate it into a variable. We then echo the network line, with the variable supplying the hostname, and redirect it into a file under /tmp. Finally, we include that file in the command section of the kickstart with %include /tmp/network.ks, as the full kickstart file further down shows.

At this point, we are essentially done. To use it, we just need to pass SERVERNAME=X (where X is the hostname you want to set) on the installer’s kernel command line. In our case, we build virtual machines with KVM via virt-install, so we pass it with the -x option in the following line:

virt-install --name jenkins --disk path=/home/vms/jenkins,size=50,bus=virtio --vnc --noautoconsole --vcpus=1 --ram=512 --network bridge=br0,mac=52:54:00:91:95:30 --location=http://192.168.15.100/mirrors/centos/6.2/os/x86_64/ -x "ks=http://192.168.15.100/mirrors/ks/6.2/kvm/x86_64-Ruby-test.cfg SERVERNAME=jenkins"
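If SERVERNAME is left off that line, the %pre loop never writes /tmp/network.ks, and the later %include would presumably have nothing to read. One way around that is a fallback name. This is only a sketch: write_network_ks, the “localhost” default and the domain argument are my own illustrative choices, not part of the original kickstart.

```shell
#!/bin/sh
# Sketch of the %pre logic with a fallback, so the network file always exists.
# write_network_ks and its arguments are hypothetical names for illustration.
write_network_ks() {
    cmdline="$1"    # contents of /proc/cmdline
    domain="$2"     # e.g. example.com
    out="$3"        # e.g. /tmp/network.ks
    name="localhost"                      # assumed default when SERVERNAME is absent
    for x in $cmdline; do
        case "$x" in
            SERVERNAME=*) name="${x#SERVERNAME=}" ;;
        esac
    done
    echo "network --device eth0 --bootproto dhcp --hostname ${name}.${domain}" > "$out"
}

# Inside a real %pre this would be:
#   write_network_ks "$(cat /proc/cmdline)" example.com /tmp/network.ks
tmpfile=$(mktemp)
write_network_ks "ro quiet SERVERNAME=jenkins" example.com "$tmpfile"
cat "$tmpfile"
```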

Here is my entire kickstart file:

install
url --url http://192.168.15.100/mirrors/centos/6.2/os/x86_64/
lang en_US.UTF-8
keyboard us
text
%include /tmp/network.ks

rootpw  --iscrypted PUTPASSWORDHERE
firewall --service=ssh
authconfig --enableshadow --passalgo=sha512 --enablefingerprint
selinux --enforcing
timezone --utc America/New_York
bootloader --location=mbr --driveorder=vda --append="crashkernel=auto rhgb quiet"
clearpart --all --drives=vda --initlabel

part /boot --fstype=ext4 --size=500
part pv.EPlgaf-h1b4-YqDI-2wfs-3C7I-SPPt-Agk5O7 --grow --size=1

volgroup vg_centos6 --pesize=4096 pv.EPlgaf-h1b4-YqDI-2wfs-3C7I-SPPt-Agk5O7
logvol / --fstype=ext4 --name=lv_root --vgname=vg_centos6 --grow --size=1024 --maxsize=51200
logvol swap --name=lv_swap --vgname=vg_centos6 --grow --size=1008 --maxsize=2016

repo --name="Local CentOS 6 - x86_64"  --baseurl=http://192.168.15.100/mirrors/centos/6.2/os/x86_64
repo --name="Local CentOS 6 - x86_64 - Updates"  --baseurl=http://192.168.15.100/mirrors/centos/6.2/updates/x86_64
repo --name="Local Custom Installs" --baseurl=http://192.168.15.100/mirrors/customrepos/centos/x86_64

%packages
@base
@console-internet
@core
@debugging
@directory-client
@hardware-monitoring
@large-systems
@network-file-system-client
@performance
@perl-runtime
@scalable-file-systems
@server-platform
gcc
gcc-c++
pax
oddjob
sgpio
certmonger
pam_krb5
krb5-workstation
nscd
pam_ldap
nss-pam-ldapd
perl-DBD-SQLite
ruby-1.9.3p0
rubygems-1.8.12
%end

%pre
#!/bin/sh
for x in `cat /proc/cmdline`; do
        case $x in SERVERNAME*)
                eval $x
                echo "network --device eth0 --bootproto dhcp --hostname ${SERVERNAME}.example.com" > /tmp/network.ks
                ;;
        esac
done
%end

%post --log=/root/my-post-log

setsebool -P use_nfs_home_dirs on
mkdir /home/users
mkdir /etc/chef

URLPOSTCONF="http://192.168.15.100/mirrors/ks"
curl ${URLPOSTCONF}/6.2/repos/CentOS-Custom.repo -o /etc/yum.repos.d/CentOS-Custom.repo
curl ${URLPOSTCONF}/6.2/autofs/auto.master -o /etc/auto.master
curl ${URLPOSTCONF}/6.2/autofs/auto.home -o /etc/auto.home
curl ${URLPOSTCONF}/keys/cacert.pem -o /etc/openldap/cacerts/cacert.pem


curl ${URLPOSTCONF}/chef/validation.pem -o /etc/chef/validation.pem
curl ${URLPOSTCONF}/chef/client.rb -o /etc/chef/client.rb
curl ${URLPOSTCONF}/chef/first-run.json -o /etc/chef/first-run.json
rpm --import ${URLPOSTCONF}/keys/legacy.key
rpm --import ${URLPOSTCONF}/keys/custom.key

authconfig --enablesssd --enableldap --enableldaptls --ldapserver=kerberos.monzell.com --ldapbasedn="dc=monzell,dc=com" --enableldapauth --update

echo "nameserver 192.168.15.57" >> /etc/resolv.conf
echo "nameserver 192.168.15.71" >> /etc/resolv.conf

gem install chef
chef-client -j /etc/chef/first-run.json
chkconfig chef-client on
chkconfig rpcbind on
chkconfig sssd on
chkconfig ntpd on
sync


%end

reboot

Let me know if this is useful. And again, I didn’t originally come up with this, so I plead innocent to charges of plagiarism. :)

Found the solution.

Essentially, I just need to add this:

supports :status => true, :restart => true, :reload => true

This tells Chef that the init script supports the status, restart and reload commands, so Chef can check whether the service is running and start it if it isn’t. Now it works as expected when I add chef-client to the run list.

Here is the updated code:

 

when "bsd"
  case node['platform']
    when "freebsd"

      directory "/etc/rc.conf.d" do
        owner "root"
        group "wheel"
        mode "0644"
        action :create
      end
      template "/etc/rc.d/chef-client" do
        source "#{dist_dir}/rc.d/chef-client.erb"
        owner "root"
        group "wheel"
        mode 0755
      end

      template "/etc/rc.conf.d/chef" do
        source "#{dist_dir}/rc.conf.d/chef.erb"
        mode 0644
        notifies :start, "service[chef-client]", :delayed
      end

      service "chef-client" do
        supports :status => true, :restart => true, :reload => true
        action [:start]
      end

    else
      log "You specified service style 'bsd'. You will need to set up your rc.local file."
      log "Hint: chef-client -i #{node["chef_client"]["client_interval"]} -s #{node["chef_client"]["client_splay"]}"
  end
else
  log "Could not determine service init style, manual intervention required to start up the chef-client service."
end

I finally updated the chef-client recipe to support FreeBSD.

Under the chef-repo/chef-client directory, I added the following files:

./templates/freebsd/rc.d/chef-client.erb
./templates/freebsd/rc.conf.d/chef.erb

And updated:

./recipes/service.rb

The locations correspond to the directory layout under the default #{conf} directory (which is apparently /etc). The templates are .erb files that correspond to the configuration files on the server.

chef-client.erb:

[rilindo@chef chef-client]$ cat ./templates/freebsd/rc.d/chef-client.erb 
#!/bin/sh

# PROVIDE: chef
# REQUIRE: LOGIN
# KEYWORD: nojail shutdown

. /etc/rc.subr

name="chef"
rcvar=`set_rcvar`
stop_cmd="chef_stop"
command="/usr/local/bin/${name}-client"
command_args="-i -s -d -L /var/log/chef/client.log -c /etc/chef/client.rb -P /var/run/chef.pid"
load_rc_config $name
export rc_pid
chef_stop()
{
	pidfile="/var/run/chef.pid"
	rc_pid=`cat ${pidfile}`
        kill $rc_pid
}

run_rc_command "$1"

chef.erb:

[rilindo@chef chef-client]$ cat ./templates/freebsd/rc.conf.d/chef
chef_enable="YES"

With ERB, I could easily have put placeholders in the templates so that they are populated with node-specific information automatically. I did not do that in this case, though. That is for another time.

Finally, I updated the service code from:

when "bsd"
  log "You specified service style 'bsd'. You will need to set up your rc.local file."
  log "Hint: chef-client -i #{node["chef_client"]["client_interval"]} -s #{node["chef_client"]["client_splay"]}"
  
else
  log "Could not determine service init style, manual intervention required to start up the chef-client service."
end

to

when "bsd"
  case node['platform']
    when "freebsd"

      directory "/etc/rc.conf.d" do
        owner "root"
        group "wheel"
        mode "0644"
        action :create
      end
      template "/etc/rc.d/chef-client" do
        source "#{dist_dir}/rc.d/chef-client.erb"
        owner "root"
        group "wheel"
        mode 0755
      end

      template "/etc/rc.conf.d/chef" do
        source "#{dist_dir}/rc.conf.d/chef.erb"
        mode 0644
        notifies :start, "service[chef-client]", :delayed
      end

      service "chef-client" do
        action [:start]
      end

    else
      log "You specified service style 'bsd'. You will need to set up your rc.local file."
      log "Hint: chef-client -i #{node["chef_client"]["client_interval"]} -s #{node["chef_client"]["client_splay"]}"
  end
else
  log "Could not determine service init style, manual intervention required to start up the chef-client service."
end

I am not sure if this is quite “rubyish”, but it works. 

At that point, I uploaded the cookbook:

knife cookbook upload chef-client

Added the recipe to the freebsd node:

knife node run_list add freebsddev.monzell.com  "recipe[chef-client]"

And ran chef-client. The chef-client program sees the recipe and installs the files to the appropriate locations:

freebsddev# /usr/local/bin/chef-client
[Sun, 01 Jan 2012 23:48:28 -0500] INFO: *** Chef 0.10.8 ***
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Run List is [recipe[chef-client]]
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Run List expands to [chef-client]
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Starting Chef Run for freebsddev.monzell.com
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Running start handlers
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Start handlers complete.
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Loading cookbooks [chef-client]
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Storing updated cookbooks/chef-client/recipes/default.rb in the cache.
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Storing updated cookbooks/chef-client/recipes/delete_validation.rb in the cache.
[Sun, 01 Jan 2012 23:48:34 -0500] INFO: Storing updated cookbooks/chef-client/recipes/service.rb in the cache.
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Storing updated cookbooks/chef-client/recipes/config.rb in the cache.
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Storing updated cookbooks/chef-client/attributes/default.rb in the cache.
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Storing updated cookbooks/chef-client/metadata.json in the cache.
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Storing updated cookbooks/chef-client/README.md in the cache.
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Storing updated cookbooks/chef-client/metadata.rb in the cache.
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing directory[/var/run] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing directory[/var/chef/cache] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing directory[/var/chef/backup] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing directory[/var/log/chef] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing directory[/etc/rc.conf.d] action create (chef-client::service line 203)
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing template[/etc/rc.d/chef-client] action create (chef-client::service line 209)
[Sun, 01 Jan 2012 23:48:35 -0500] INFO: Processing template[/etc/rc.conf.d/chef] action create (chef-client::service line 216)
[Sun, 01 Jan 2012 23:48:36 -0500] INFO: Processing service[chef-client] action start (chef-client::service line 222)
[Sun, 01 Jan 2012 23:48:36 -0500] INFO: Chef Run complete in 2.340431224 seconds
[Sun, 01 Jan 2012 23:48:36 -0500] INFO: Running report handlers
[Sun, 01 Jan 2012 23:48:36 -0500] INFO: Report handlers complete

I am mostly done now. I just need to start it up with:

/etc/rc.d/chef-client start

It should be able to start automatically. However, getting it to start up automatically upon installation has so far just returned this:

freebsddev# /usr/local/bin/chef-client
[Sun, 01 Jan 2012 23:46:26 -0500] INFO: *** Chef 0.10.8 ***
[Sun, 01 Jan 2012 23:46:32 -0500] INFO: Run List is [recipe[chef-client]]
[Sun, 01 Jan 2012 23:46:32 -0500] INFO: Run List expands to [chef-client]
[Sun, 01 Jan 2012 23:46:32 -0500] INFO: Starting Chef Run for freebsddev.monzell.com
[Sun, 01 Jan 2012 23:46:32 -0500] INFO: Running start handlers
[Sun, 01 Jan 2012 23:46:32 -0500] INFO: Start handlers complete.
[Sun, 01 Jan 2012 23:46:32 -0500] INFO: Loading cookbooks [chef-client]
[Sun, 01 Jan 2012 23:46:33 -0500] INFO: Storing updated cookbooks/chef-client/recipes/default.rb in the cache.
[Sun, 01 Jan 2012 23:46:33 -0500] INFO: Storing updated cookbooks/chef-client/recipes/delete_validation.rb in the cache.
[Sun, 01 Jan 2012 23:46:33 -0500] INFO: Storing updated cookbooks/chef-client/recipes/service.rb in the cache.
[Sun, 01 Jan 2012 23:46:33 -0500] INFO: Storing updated cookbooks/chef-client/recipes/config.rb in the cache.
[Sun, 01 Jan 2012 23:46:33 -0500] INFO: Storing updated cookbooks/chef-client/attributes/default.rb in the cache.
[Sun, 01 Jan 2012 23:46:34 -0500] INFO: Storing updated cookbooks/chef-client/metadata.json in the cache.
[Sun, 01 Jan 2012 23:46:34 -0500] INFO: Storing updated cookbooks/chef-client/README.md in the cache.
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Storing updated cookbooks/chef-client/metadata.rb in the cache.
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing directory[/var/run] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing directory[/var/chef/cache] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing directory[/var/chef/backup] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing directory[/var/log/chef] action create (chef-client::service line 42)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing directory[/etc/rc.conf.d] action create (chef-client::service line 203)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing template[/etc/rc.d/chef-client] action create (chef-client::service line 209)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing template[/etc/rc.conf.d/chef] action create (chef-client::service line 216)
[Sun, 01 Jan 2012 23:46:35 -0500] INFO: Processing service[chef-client] action restart (chef-client::service line 222)
[Sun, 01 Jan 2012 23:46:36 -0500] ERROR: service[chef-client] (chef-client::service line 222) has had an error
[Sun, 01 Jan 2012 23:46:36 -0500] ERROR: service[chef-client] (/var/chef/cache/cookbooks/chef-client/recipes/service.rb:222:in `from_file') had an error:
service[chef-client] (chef-client::service line 222) had an error: Chef::Exceptions::Exec: /etc/rc.d/chef-client stop returned 1, expected 0
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/mixin/command.rb:127:in `handle_command_failures'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/mixin/command.rb:74:in `run_command'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/provider/service/init.rb:45:in `stop_service'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/provider/service/init.rb:55:in `restart_service'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/provider/service.rb:78:in `action_restart'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource.rb:440:in `run_action'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/runner.rb:45:in `run_action'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/runner.rb:81:in `block (2 levels) in converge'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/runner.rb:81:in `each'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/runner.rb:81:in `block in converge'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection.rb:94:in `block in execute_each_resource'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection/stepable_iterator.rb:116:in `call'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection/stepable_iterator.rb:85:in `step'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/resource_collection.rb:92:in `execute_each_resource'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/runner.rb:76:in `converge'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/client.rb:312:in `converge'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/client.rb:160:in `run'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/application/client.rb:239:in `block in run_application'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/application/client.rb:229:in `loop'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/application/client.rb:229:in `run_application'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/lib/chef/application.rb:67:in `run'
/usr/local/lib/ruby/gems/1.9/gems/chef-0.10.8/bin/chef-client:26:in `<top (required)>'
/usr/local/bin/chef-client:19:in `load'
/usr/local/bin/chef-client:19:in `<main>'
[Sun, 01 Jan 2012 23:46:36 -0500] ERROR: Running exception handlers
[Sun, 01 Jan 2012 23:46:36 -0500] FATAL: Saving node information to /var/chef/cache/failed-run-data.json
[Sun, 01 Jan 2012 23:46:36 -0500] ERROR: Exception handlers complete
[Sun, 01 Jan 2012 23:46:36 -0500] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[Sun, 01 Jan 2012 23:46:36 -0500] FATAL: Chef::Exceptions::Exec: service[chef-client] (chef-client::service line 222) had an error: Chef::Exceptions::Exec: /etc/rc.d/chef-client stop returned 1, expected 0

Essentially, it couldn’t find the PID file in the expected location, which is no surprise, as I had been running chef-client manually without any arguments. Hopefully I can figure out a fix for that soon.
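One way to keep a restart from dying on “stop returned 1” is to have the stop function treat a missing or empty PID file as “already stopped”. The following is a minimal sketch of that idea, not the fix I ended up with; the optional pidfile argument is an assumption added so the function can be tried outside /var/run.

```shell
#!/bin/sh
# Sketch of a stop function that tolerates a missing or stale PID file.
# The pidfile argument is hypothetical; the rc script hardcodes /var/run/chef.pid.
chef_stop() {
    pidfile="${1:-/var/run/chef.pid}"
    if [ ! -r "$pidfile" ]; then
        echo "chef not running? (no ${pidfile})"
        return 0                       # nothing to stop is not an error
    fi
    pid=$(cat "$pidfile")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        kill "$pid"                    # process exists: send SIGTERM
    else
        echo "stale pid file ${pidfile}"
    fi
}

# With no PID file present, this prints a note and exits with status 0.
chef_stop ./no-such-chef.pid
```

Returning 0 in the “not running” case means a restart can carry on to the start step instead of aborting the whole Chef run.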

As I mentioned before, Chef appears to work well mostly on Debian and Ubuntu. You will have to do a bit more work on other OSes; in the case of FreeBSD, a lot more.

Here is one example: the chef-client recipe is used to install startup scripts on the nodes (rc scripts for Red Hat, upstart for Ubuntu, etc.). It works on most OSes, except for BSD systems. In fact, when the code notices it is on a BSD system, it just puts out the following:

when "bsd"
  log "You specified service style 'bsd'. You will need to set up your rc.local file."
  log "Hint: chef-client -i #{node["chef_client"]["client_interval"]} -s #{node["chef_client"]["client_splay"]}"
  
else
  log "Could not determine service init style, manual intervention required to start up the chef-client service."
end

In other words, it doesn’t even bother.

I am not sure if it is out of laziness or just limited resources that they didn’t create rc scripts for BSD (I could understand OpenBSD, but FreeBSD?), so I created the following rc script:

#!/bin/sh

# PROVIDE: chef
# REQUIRE: LOGIN
# KEYWORD: nojail shutdown

. /etc/rc.subr

name="chef"
rcvar=`set_rcvar`
stop_cmd="chef_stop"
command="/usr/local/bin/${name}-client"
command_args="-i -s -d -L /var/log/chef/client.log -c /etc/chef/client.rb -P /var/run/chef.pid"
load_rc_config $name
export rc_pid
chef_stop()
{
	pidfile="/var/run/chef.pid"
	rc_pid=`cat ${pidfile}`
        kill $rc_pid
}

run_rc_command "$1"

Ordinarily, I shouldn’t have to create a separate function to kill a chef process, but for some reason, the rc functions within FreeBSD can’t find the PID. 

Interestingly enough, while debugging the script with truss, I found an undocumented feature: instead of adding the entry that enables a service to /etc/rc.conf, you can put it in a file under /etc/rc.conf.d, which is what I did:

freebsd82# pwd
/etc/rc.d
freebsd82# cd ../rc.conf.d
freebsd82# ls
chef
freebsd82# cat chef
chef_enable="YES"


Apparently it came from NetBSD.
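Scripted, the rc.conf.d approach comes down to writing one small file per service. A minimal sketch, assuming a helper I am calling enable_in_rc_conf_d; the directory argument exists only so it can be tried outside /etc:

```shell
#!/bin/sh
# enable_in_rc_conf_d is a hypothetical helper name, not an rc.subr function.
enable_in_rc_conf_d() {
    name="$1"
    dir="${2:-/etc/rc.conf.d}"
    mkdir -p "$dir" || return 1
    # One file per service, holding the same line you would put in /etc/rc.conf.
    printf '%s_enable="YES"\n' "$name" > "${dir}/${name}"
}

# Try it against a scratch directory instead of /etc/rc.conf.d.
scratch=$(mktemp -d)
enable_in_rc_conf_d chef "$scratch"
cat "${scratch}/chef"    # prints chef_enable="YES"
```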

With that, I got a working chef init script. Now to see if I can update the chef-client recipe and get it working on FreeBSD.

verycrispy:

Recently, I have been googling how to make tunnels so I thought I would post what I do. An SSH tunnel allows you to connect to server A, through server B, from client C.

You generally only want to set up a tunnel when you need to connect to server A but only have access to server B from your…

I have been playing around with Chef for the past week and while I liked it, it was a pain to set up. It seems to work well if you run Debian or Ubuntu. Everything else … not so much.

The first sign of trouble came when I attempted to bootstrap the install. The install calls for installing Ruby from the RBEL repo, which I didn’t have too much trouble with. In fact, they have binary RPMs of chef already available, so I used those initially and installed with:


yum install rubygem-chef-server --disablerepo=updates --disablerepo=CentOS-Custom --disablerepo=extras

(CentOS-Custom is my own repo, by the way).

That went well, until it turned out that it installed Ruby 1.8 along with it.

So I removed that. I spent the next few hours trying (and failing) to install Ruby 1.9 while avoiding having to install 1.8. In the end, I gave up. Instead, what I did was the following:

  1. Installed the prerequisites for Ruby (including my build of Ruby 1.9 and Rubygems).
  2. Then I ran “gem install ruby-shadow”, as there was no RPM for it in the CentOS repo.
  3. Then I installed the EPEL repo (instead of the RBEL repo). That allowed me to proceed with the install of chef with “gem install chef”. That, in turn, took care of all the requirements and package installation.

The next step is to configure a web proxy, as detailed here. I decided to deviate slightly and just use Red Hat’s utility with:

genkey chef.monzell.com

And then open the firewall ports.

However, because I had SELinux running, Apache was not able to communicate with another application (as they are in different security contexts). So I had to enable access with:

setsebool -P httpd_can_network_connect on

That got me further, but I still had issues. After tailing the audit log and piping the output to audit2allow, I found that I still needed to open a port in SELinux:

#============= httpd_t ==============

allow httpd_t reserved_port_t:tcp_socket name_bind;


I enabled access with:

[root@chef audit]# tail audit.log  | audit2allow -M chef444

******************** IMPORTANT ***********************

To make this policy package active, execute:

semodule -i chef444.pp

I installed the module and got the web access working.
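Put together, the SELinux steps above look roughly like the sketch below. The helper name, the grep filter on httpd and the dry-run default are my own assumptions for illustration; audit2allow -M and semodule -i are the same commands shown above.

```shell
#!/bin/sh
# Hypothetical helper: turn httpd denials from the audit log into a local
# policy module. By default it only prints the commands; RUN=1 executes them.
selinux_allow_from_audit() {
    module="$1"                              # e.g. chef444
    log="${2:-/var/log/audit/audit.log}"     # assumed audit log location
    if [ "${RUN:-0}" = "1" ]; then
        grep httpd "$log" | audit2allow -M "$module" && semodule -i "${module}.pp"
    else
        echo "grep httpd $log | audit2allow -M $module"
        echo "semodule -i ${module}.pp"
    fi
}

# Dry run: shows what would be executed for the module name used above.
selinux_allow_from_audit chef444
```

It is worth reading the generated .te file before installing the module, since audit2allow will happily allow whatever was denied.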

There is more, but that’s for another post. :)

(As a side note, is there a Tumblr theme that is code friendly, that is, one where I can paste in code and command line snippets without them looking like snot?)

EDIT: Nevermind, looks like I’ll be poking around with CSS again to get it working the way I like.

Finally got my FreeBSD client to authenticate against my OpenLDAP server. 

The configuration is fairly straightforward. What took the time was compiling the dependencies (running it in a VM can do that). That and the following issues.

- It seems that Perl is not a requirement for a FreeBSD install. Not a big deal (thinking about it, it makes sense historically), but I needed to get the certs installed, which meant an install of Perl. Fun.

- ca-root no longer exists. Had to use ca-root-nss to build.

- After working with Red Hat for a while, manually setting up pam was a pain.

- I couldn’t get past pam_ldap almost all night and part of the afternoon, until I tailed /var/log/auth.log, which showed me this:

User rfoster not allowed because shell /bin/bash does not exist

Bash is not installed by default. Another compile. But afterwards, I was finally able to log in.

From there, it was a matter of getting amd to work so that I can automount the directories. Using this as a guideline, I set up the symlinks in /usr/home to the mounts:

ln -sf /host/kerberos.monzell.com/exports/users .

Then I added my ldap user to the wheel group (so that I can become root):

freebsd82# pw groupmod wheel -m rfoster
freebsd82# pw groupshow wheel
wheel:*:0:rilindo,rfoster
freebsd82# 


And… I am done.

Next, configure SuSE Enterprise Linux 11 with LDAP authentication. :)

I ran into an interesting problem sometime back that I only now resolved.

Originally, I was running Scientific Linux on most of my VMs. I have since upgraded most of them to CentOS 6.0, and converted one in particular to CentOS CR.

That “broke” my ldap authentication: when connecting to the server with the ldapuser credentials, sssd returned the following:

Could not start TLS encryption. TLS error -8172:Unknown code ___f 20

At the time, I thought that there was a bug in the updated sssd package on CentOS CR, so I ignored it for a while, until I logged into a fairly new Scientific Linux 6.1 VM today and it gave me the same message.

That was curious. So I dug deep into searching for a solution. 

As it turns out, the problem was my configuration. Apparently I had been connecting with a self-signed certificate, which the following confirmed:

[root@localhost ~]# openssl verify cacert.pem 
backup.cacert.pem: C = US, ST = Georgia, O = Monzell Management Systems, OU = IT, CN = example.com, emailAddress = rilindo.foster@example.com
error 18 at 0 depth lookup:self signed certificate
OK

It seems that the initial version of sssd in 6.0 allowed me to connect without complaining about the self-signed cert. The updated version now refuses to connect without a valid certificate. I can confirm that by running:

ldapsearch -d -1 -vvvvv -w PASSWORD -ZZZ -H ldap://ldap.example.com -D "cn=root,dc=example,dc=com" "(uid=joeuser)"

<snip>

                                         ..                

TLS: certificate [CN=StartCom Certification Authority,OU=Secure Digital Certificate Signing,O=StartCom Ltd.,C=IL] is not valid - error -8172:Unknown code ___f 20.
tls_write: want=7, written=7
  0000:  15 03 01 00 02 02 30                               ......0           
TLS: error: connect - force handshake failure: errno 0 - moznss error -8172
TLS: can't connect: TLS error -8172:Unknown code ___f 20.

After I put in the correct certificate, I was able to connect:

<snip>

tls_read: want=48, got=48
  0000:  f0 85 60 72 54 c1 3b c8  6f 53 c4 f0 89 82 27 17   ..`rT.;.oS....'.
  0010:  3c 3f 99 8f 18 64 22 ae  41 28 d4 a6 0b 0f a4 de   <?...d".A(......
  0020:  36 10 3e d4 6c f5 73 fb  cb 12 04 af 64 7f 14 69   6.>.l.s.....d..i
TLS certificate verification: subject: E=webmaster@example.com,CN=kerberos.example.com, <REDACTED>
ldap_sasl_bind
ldap_send_initial_request
ldap_send_server_request

<snip>


And openssl returned with no errors:
[root@localhost cacerts]# openssl verify cacert.pem
cacert.pem: OK