June 19, 2013
dear #openstack people.
I just read
From now on you will stop it with the cutesy naming.
the network bits will be called ‘network’
the compute bits will be called ‘compute’
the block storage will be called ‘blockstore’
the object store will be called ‘objectstore’
the authn/z bits will be called ‘authentication’
the image storage will be called ‘imagestore’
If there are other major components you need – they will be named precisely based on what they are.
If you rev those pieces in major ways you will just iterate the major version number.
If you cannot cope with these rules someone is going to drop heavy things near your toes.
You have used up all your name change turns. You are done.
June 11, 2013
A discussion last week made me think of the following:
Ansible as a mechanism to provide network/infrastructure-wide cron.
A couple of systems that do major administrative tasks could have an infra-cron file like:
01 04 * * * root run_system_wide_task
0 01 * * Sun root trigger_client_backups
Now, I’m sure lots of you are saying ‘yes, that’s cron, you don’t need another one’ but with ansible you could have an orchestrated cron. A cron that properly says ‘wait for the previous task to finish before you launch this other one’, or a cron that can handle contingencies better if some of your systems are offline or disconnected.
I don’t have any code for this but I wanted to toss it out as a potentially odd idea that maybe someone would love.
Got a ridiculous process **cough**Jenkins**Cough** that you have to wait to create a dir before doing things?
This might help you as godawful ugly as it is.
- name: wait for a dir to exist - this is just ugly
  shell: while true; do [ -d /var/lib/jenkins/plugins/openid/WEB-INF/lib/ ] && break; sleep 5; done
May 17, 2013
Working on something for Spot, I revived some code I had written a
few years ago, and then discovered that other people had written much more
robust leveled topological sorts than mine.
Anyway – if you grab the files from:
python buildorder.py /path/to/*.src.rpm
it will look up the interdependencies of the src.rpms to figure out a
build order. It outputs a bunch of different things:
1. a flat build order
2. a build order broken out by groups – you can build all the pkgs in
any group in parallel provided that all the pkgs in the previous group
have finished building.
3. outputs lists of direct loops between srpms.
4. probably will output A LOT of noise and garbage from the rpm
specfile parsing from the rpm.spec() module
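The grouped build order (output 2) is essentially a leveled topological sort. Here is a minimal sketch of just that part, with a made-up dependency dict standing in for the graph buildorder.py derives from the src.rpms:

```python
# Leveled topological sort: group packages so that everything a package
# build-requires lands in an earlier group. A sketch only -- the real
# script gets its graph from rpm specfile parsing; `deps` here is made up.

def leveled_toposort(deps):
    """deps maps each package to the set of packages it build-requires.
    Returns a list of groups; members of a group only depend on members
    of earlier groups, so each group can be built in parallel."""
    deps = {n: set(d) for n, d in deps.items()}
    groups = []
    while deps:
        # packages whose remaining deps are all already built
        # (deps on packages outside the graph are treated as satisfied)
        ready = {n for n, d in deps.items() if not (d & set(deps))}
        if not ready:
            raise ValueError('dependency loop among: %s' % sorted(deps))
        groups.append(sorted(ready))
        for n in ready:
            del deps[n]
    return groups

if __name__ == '__main__':
    deps = {
        'foo': set(),
        'bar': {'foo'},
        'baz': {'foo'},
        'quux': {'bar', 'baz'},
    }
    for i, group in enumerate(leveled_toposort(deps), 1):
        print('group %d: %s' % (i, ' '.join(group)))
```

A direct two-package loop shows up as the ValueError; that roughly corresponds to the loop lists in output 3.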
But it might be worth a look, and, ideally, worth patches to make it a bit more robust.
If you have a set of pkgs which you need to build but you can’t figure
out the buildorder this might help you out.
I’d love to know how often it is right or ‘right enough’.
1. some spec files make the rpm.spec() parsing break in interesting
ways – sometimes with a traceback
2. if a pkg is not dependent on any other pkg and nothing else depends
on it – it gets lumped into the last grouping. Not really an issue –
just something someone noticed and was surprised by.
3. It does not handle file buildreqs at all, nor virtual-provide
buildreqs, and if your buildreqs are REALLY picky about
requiring <= some version – it will ignore all of that.
4. I fully expect that circular build deps of 2 or more levels (foo req bar
req baz req quux req foo) will not be detected but will make the topological
sort function die. If so…. tough… go fix your packaging.
Anyway – give it a run and see if it helps you solve a problem.
If it does let me know about it. Some of us are curious if this could
fit well in mockchain or wrapped around/in mockchain.
April 29, 2013
We needed more space for cinder and had no nice way to expand it on our existing cinder server so after banging my head a bit I got assistance from Giulio Fidente who was able to show me a working config that let me figure out what I was missing. Below I document it so others might be able to find it, too.
NOTE: this works under folsom on rhel 6.4. I cannot vouch for anything else – but Giulio had it running on grizzly I think so…
You have an existing cinder server setup and running – which includes
a volume server, an api service and a scheduler service. You need to
add more space and you have a system where that can run.
Here’s all you need to do:
1. install openstack-cinder on the server you want to be a new volume server
2. make sure your new system can access the mysql server on your primary
3. make sure tgtd knows to include the files in /etc/cinder/volumes (an include line in /etc/tgt/targets.conf)
4. make sure your other compute nodes can access the iscsi-target port
(3260/tcp) on the system you want to add as a cinder-volume server
5. setup your /etc/cinder/cinder.conf
sql_connection = mysql://cinder_user:cinder_pass@mysqlhost/cinder
auth_strategy = keystone
rootwrap_config = /etc/cinder/rootwrap.conf
rpc_backend = cinder.openstack.common.rpc.impl_qpid
qpid_hostname = qpid_hostname_ip_here
volume_group = cinder-volumes
iscsi_helper = tgtadm
iscsi_ip_address = my_volume_ip
logdir = /var/log/cinder
state_path = /var/lib/cinder
lock_path = /var/lib/cinder/tmp
volumes_dir = /etc/cinder/volumes
6. start tgtd and openstack-cinder-volume
service tgtd start
service openstack-cinder-volume start
7. check out /var/log/cinder/volume.log
8. Verifying it worked:
on your cloud controller run:
cinder-manage host list
you should see all of your volume servers there.
9. creating a volume: just make a volume as usual – the scheduler
should default to the volume server with the most space available
10. on your new cinder-volume server run lvs to look for the new volume.
Things I learned today:
1. the predictable network device naming stuff in systemd is kinda arbitrary when it comes to cloud imgs that may run on a variety of virt systems – so to turn it off just add this to your %post in your kickstart:
# disable systemd 'predictable' device names for networks w/a hammer
ln -s /dev/null /etc/udev/rules.d/80-net-name-slot.rules
cat > /etc/sysconfig/network-scripts/ifcfg-eth0 << EOF
DEVICE="eth0"
BOOTPROTO="dhcp"
ONBOOT="yes"
TYPE="Ethernet"
EOF
That last bit is just to make a generic ifcfg-eth0 so ifup eth0 works normally.
2. the hostonly initramfs that dracut makes now plays up when you are moving an image around. Make sure you add
to %packages to get it to behave as you’d expect
3. if you don’t have a lot of memory then you may not want tmpfs for /tmp - to turn that off just do:
systemctl mask tmp.mount
in %post and it will be as you’d expect.
4. syslinux-extlinux is WAY nicer and simpler to use than grub2
Thanks to Mattdm for making the syslinux-extlinux option for anaconda happen.
April 25, 2013
A while back I wrote this for func – and I found I needed it ported to ansible.
I enhanced it to make it take more than just 2 systems. It can now compare any number of systems to the base system
Takes a first argument of your ‘baseline’ host – that’s the host all the other hosts’ package sets will be compared to.
It grabs the list of rpms installed on each system (just using rpm -qa, I’m lazy, or I could have used yum list installed)
It transforms that output into a set – then does a difference on them each way.
Output looks like this
$ ans_rpm_compare.py app01.phx2.fedoraproject.org app02.phx2.fedoraproject.org
Packages on app01.phx2.fedoraproject.org not on app02.phx2.fedoraproject.org
Packages on app02.phx2.fedoraproject.org not on app01.phx2.fedoraproject.org
It’s trivial, but it should be straightforward to follow how it works in the code.
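The core of it is just set differences, which can be sketched without any ansible at all. The hostnames and the rpm_lists dict below are made up; the real script gathers the rpm -qa output over ansible:

```python
# Sketch of the comparison core: given per-host package lists, diff
# every other host against a baseline host, both directions.
# Hostnames and package lists here are invented for illustration.

def compare_packages(baseline, others, rpm_lists):
    """rpm_lists maps hostname -> iterable of installed package NVRs.
    Prints and returns, per host, what the baseline has that the host
    lacks and what the host has that the baseline lacks."""
    base = set(rpm_lists[baseline])
    results = {}
    for host in others:
        pkgs = set(rpm_lists[host])
        missing = sorted(base - pkgs)   # on baseline, not on host
        extra = sorted(pkgs - base)     # on host, not on baseline
        results[host] = (missing, extra)
        print('Packages on %s not on %s' % (baseline, host))
        for p in missing:
            print('  ' + p)
        print('Packages on %s not on %s' % (host, baseline))
        for p in extra:
            print('  ' + p)
    return results

if __name__ == '__main__':
    rpm_lists = {
        'app01': ['bash-4.2', 'httpd-2.4', 'mod_ssl-2.4'],
        'app02': ['bash-4.2', 'httpd-2.4', 'php-5.4'],
    }
    compare_packages('app01', ['app02'], rpm_lists)
```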
No idea where else to put it so it goes into my scripts git repo.
April 22, 2013
If you see this traceback in your /var/log/mailman/error file
File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 120, in _oneloop
File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 191, in _onefile
keepqueued = self._dispose(mlist, msg, msgdata)
File "/usr/lib/mailman/Mailman/Queue/ArchRunner.py", line 73, in _dispose
File "/usr/lib/mailman/Mailman/Archiver/Archiver.py", line 216, in ArchiveMail
File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 583, in processUnixMailbox
File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 635, in add_article
article.parentID = parentID = self.get_parent_info(arch, article)
File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 669, in get_parent_info
if parentID and not self.database.hasArticle(archive, parentID):
File "/usr/lib/mailman/Mailman/Archiver/HyperDatabase.py", line 273, in hasArticle
File "/usr/lib/mailman/Mailman/Archiver/HyperDatabase.py", line 251, in __openIndices
t = DumbBTree(os.path.join(arcdir, archive + '-' + i))
File "/usr/lib/mailman/Mailman/Archiver/HyperDatabase.py", line 65, in __init__
File "/usr/lib/mailman/Mailman/Archiver/HyperDatabase.py", line 170, in load
self.dict = marshal.load(fp)
ValueError: bad marshal data
It is due to a corrupted archive database. Those live in /var/lib/mailman/archives/private/$list/database/*
In order to figure out which one it is – you have to run this:
import sys, marshal
for fn in sys.argv[1:]:
    try:
        marshal.load(open(fn, 'rb'))
    except Exception as e:
        print('%s: %s' % (fn, e))
against the files in the dir I mentioned above.
python thatscript /var/lib/mailman/archives/private/$list/database/2013-April*
That will tell you if a file is busted (it will print out the exception), but it won’t fix it.
You will probably need to run it against all of the current files for all the lists you have
Once you figure out which lists are broken you SHOULD be able to run
bin/arch --wipe $list /var/lib/mailman/archives/private/$list.mbox/$list.mbox
and have it recreate the whole thing.
March 27, 2013
I’m trying to produce a simple list of instances on the fedora
openstack instance. I want to produce a list every 10m or so and diff
it against the last copy of that list and output the changes.
Here’s what I came up with:
It is based originally on nova-manage. It runs as root on the head system in our cloud and just dumps out json, then diffs the json.
Everything works but I’m trying to figure out if this is the ‘right’
way of going about this.
I thought about doing it via nova instead of using the nova-manage
direct-to-db api but I had 2 problems:
1. I would need to save the plaintext admin pw somewhere on disk to
poll for that info
2. or get a token which I would have to renew every 24 hours
We’re using the above script as a simple cron job that lets us know
what things are changing in our cloud (who is bringing up new
instances, how many, what ips they are attaching to them, etc)
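The diff half of that cron job is simple enough to sketch. Assuming each run dumps a JSON dict keyed by instance uuid (the field names below are made up, not what nova-manage emits):

```python
# Sketch of diffing two instance-list snapshots. Each snapshot is a
# dict keyed by instance uuid, as you'd get from json.load() on the
# dumped file; the uuids and fields here are invented.

def diff_snapshots(old, new):
    """Returns (added, removed, changed) lists of instance uuids."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(u for u in set(old) & set(new) if old[u] != new[u])
    return added, removed, changed

if __name__ == '__main__':
    old = {'uuid-1': {'ip': '10.0.0.5', 'user': 'alice'}}
    new = {'uuid-1': {'ip': '10.0.0.9', 'user': 'alice'},
           'uuid-2': {'ip': '10.0.0.6', 'user': 'bob'}}
    added, removed, changed = diff_snapshots(old, new)
    print('added: %s removed: %s changed: %s' % (added, removed, changed))
```

Running that every 10 minutes against the previous snapshot gives you the "who brought up what" report.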
Additionally, is there a way in the db api to easily query the tenant and user info from keystone? I’d like to expand out the user uuid into username/project name.
March 22, 2013
Dealing with a potential problem, we were trying to figure out a way to proxy/redirect git:// calls from one server to another. This is a fairly ridiculous script I hacked up in the wee small hours of Thursday morning after talking to Sitaram Chamarty on #git for a while.
I fully expect this won’t work well under load but it does seem to function in my small tests here.
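This is not the script from the post, but the general shape of the idea can be sketched: a plain TCP forwarder listening on the git port that pipes bytes to another git daemon. The hostnames and ports below are made up:

```python
# Minimal sketch of a git:// forwarder: accept connections and shuffle
# bytes both ways to a backend server. git speaks its own protocol on
# top; we never look inside it. Addresses below are hypothetical.
import socket
import threading

def pipe(src, dst):
    # copy bytes one direction until the sender closes its side
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except socket.error:
            pass

def handle(client, backend_addr):
    # connect to the real server and pump both directions
    backend = socket.create_connection(backend_addr)
    t = threading.Thread(target=pipe, args=(backend, client))
    t.daemon = True
    t.start()
    pipe(client, backend)
    t.join()

def serve(listen_addr, backend_addr):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(listen_addr)
    srv.listen(5)
    while True:
        client, _ = srv.accept()
        t = threading.Thread(target=handle, args=(client, backend_addr))
        t.daemon = True
        t.start()

# to run it for real, something like:
# serve(('0.0.0.0', 9418), ('git-backend.example.com', 9418))
```

A thread per connection like this is exactly the kind of thing that won't hold up under load, which matches the post's caveat.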