admin log
fd.o admin tasks and planned outages



Fri, 18 Jan 2013

EditGroup on the and wikis

The and wikis have long been plagued with lots and lots of spam. Alan Coopersmith has made a heroic effort to keep it at bay, but it has been a manual effort and not something to rely on long-term. I've now created an EditGroup on both wikis, and any pages which do not otherwise have ACLs associated with them are now editable only by members of the EditGroup. The group can manage itself, so either ask an existing member or ask on #freedesktop on IRC and we'll add you.

I'd like to apologise for needing to take this step, but the spam problem is bad enough that it's unfortunately needed.

– tfheen
[00:51] | [tfheen] | # | TB

Tue, 02 Oct 2012

Replacement for suffered catastrophic data loss a short while ago. Some projects on were using them to get notifications to their IRC channel. I've now set up KGB, which is a similar service on our own infrastructure. If you want to have this enabled for your project, please file a ticket in bugzilla with information about which repositories should generate notifications sent to which IRC channel.

– tfheen
[06:08] | [tfheen] | # | TB

Thu, 02 Aug 2012

Explanation of today's downtime

Today, more or less all of's infrastructure fell over. This blog post is an attempt at explaining what went wrong and what we're doing to prevent it from happening again. I'd also like to apologise for the outage being much longer than what I consider reasonable.

To understand some of the background, it's useful to look a little bit back in time, when Intel, HP and Google together enabled us to buy some new servers to replace the aging existing ones. We have slowly moved servers into virtual machines, the latest being the bugzilla frontend and backend moving into two new VMs. There are still a few servers left, but we'll get rid of them in the not-too-distant future, or at least, that's the plan.

Today, two of those new machines failed. One, teodor, hosts cgit, annarchy and an admin VM. This seems to have been a kernel bug somewhere in KVM land. For some reason, the other primary server, lyle, also stopped responding at the same time. I'm not sure why that happened, since it seemed fine judging from the logs. All this happened this morning, European time. For various reasons, I did not have the iLO passwords, so until I could obtain those, there was no way for me to reset the systems, and due to timezones, this took a while. Once I got the passwords, rebooting the systems (as they were unresponsive) was done quickly enough and everything recovered afterwards.

What are we doing to make sure this does not happen again? Primarily, more people now have the iLO passwords, meaning we should be able to respond much quicker. We're also distributing the contact information for the various people a bit, so if shit hits the fan again, getting hold of the right people will be easier.

– tfheen
[14:01] | [tfheen] | # | TB

Mon, 05 Mar 2012

Annarchy moved into a VM

The hardware was running on was getting old and started having memory errors. As a stop-gap measure, the machine has moved wholesale into a virtual machine running on teodor. It also got some more memory and CPU in the process and so should hopefully be a bit snappier.

Sorry about the lack of notification surrounding this, but the hardware errors made posting anything a lot harder, since this blog runs on the host which had errors.

– tfheen
[14:11] | [tfheen] | # | TB

Fri, 06 Jan 2012

New hosts, cgit and more

A while ago, we got some money from Intel, Google and HP for new machines for Due to various logistical problems, it has taken us far too long to actually start using those, but we're now in the process of doing so.

The first service to move to the new infrastructure is cgit. As part of this transition, we'll also be retiring some old services, amongst those are:

  • gitweb; we have cgit and gitweb does not see much traffic
  • cvs; nothing still lives in CVS. We will keep the directories around for the foreseeable future
  • svn; nothing uses this any longer. We will keep the directories around for the foreseeable future

We might also retire, but that is not yet decided. The rationale is that HTTP is widespread and works well for handling downloads, and for mirroring, rsync is better suited than FTP.

Additionally, over the weekend we'll move bugzilla to the new hosts as well, something that should improve performance quite a bit, but it means bugzilla might be unavailable sometimes during the weekend.

Feedback is most welcome.

– tfheen
[11:03] | [tfheen] | # | TB

Sun, 16 Oct 2011

Bugzilla and annarchy upgraded

Bugzilla has now been upgraded from 3.4.6 to 4.0.2. This enables XML-RPC support, gives us the option of moving to a new bug workflow, and fixes a bunch of security issues.

To support this, was upgraded from Debian 5.0.9 (lenny) to Debian 6.0.3 (squeeze). I believe everything is working fine again, but if you see something broken, please tell us and we'll try to fix it.

– tfheen
[03:43] | [tfheen] | # | TB

Sat, 24 Apr 2010

Mailman bounces

The mailman setup on gabe was slightly broken in such a way that it did not process bounces. I'm not sure how long it's been that way, but I suspect quite a long time. Some of the list admins might have noticed that a large number of people were unsubscribed a couple of days ago. This is just a backlog of people who should have been unsubscribed already, some of them years ago, so there is no need to worry.

– tfheen
[00:51] | [tfheen] | # | TB

Mon, 19 Apr 2010

Email/greylisting trouble

The postgrey daemon on gabe had decided to greylist everything in the world. I am not sure for how long it did so, but it was restarted yesterday and mail seems happier today. Likewise, mailman on had stopped working and was restarted. If you got a small torrent of emails from lists, this is why. I don't yet know what the reason for the trouble was, but will investigate, and I'm sorry for the problems.

– tfheen
[23:42] | [tfheen] | # | TB

Fri, 02 Apr 2010

Bugzilla upgraded

Bugzilla has now been upgraded from 3.0.8 to 3.4.6. In addition, I have installed splinter which should help with patch reviews. Bugzilla has also moved from mysql to postgres, but this should not have any user-visible impact.

Please tell us if there are problems.

– tfheen
[08:23] | [tfheen] | # | TB

Wed, 31 Mar 2010

Bugzilla upgrade 2010-04-02 morning UTC

I'll be upgrading Bugzilla on this Friday, so Bugzilla will be unavailable for a little while. No data should be lost, but you will obviously not be able to file new bugs or comment on existing bugs while the upgrade is in progress. – tfheen
[01:22] | [tfheen] | # | TB

Mon, 22 Mar 2010

Planned downtime 2010-03-25 morning UTC

I'll be installing new kernels on all hosts this Thursday morning, so expect a little bit of downtime and, naturally, reboots of all machines. – tfheen
[13:48] | [tfheen] | # | TB

Commit mails broken (and fixed again)

In order to fix a problem with BSD `mailx` and UTF-8 (yay legacy tools), I installed heirloom-mailx. Unfortunately, this also broke any commit hooks that used `-a` to set custom headers, and it doesn't look like Heirloom has a way to set arbitrary headers, so I have reverted this change. Sorry about the trouble; you should have your commit emails back again. – tfheen
[13:44] | [tfheen] | # | TB

Fri, 22 Aug 2008

HEADSUP: pkg-config hosting/wiki updated

Just a quick note to let people know that pkg-config's hosting has now changed. The website has moved from gabe to annarchy and the wiki has been upgraded to support CAPTCHAs. Please stay tuned whilst DNS expires. If you see 'Down for Maintenance', your DNS cache just needs to expire.
[06:52] | [benjsc] | # | TB

Fri, 27 Jun 2008

Bugzilla SSL Certificate Upgraded - CACert signed

After lots of nagging, the Bugzilla SSL certificate has finally been upgraded to be signed by CACert. With any luck Mozilla will eventually support CACert as an issuing agent, and then the now even more vicious warnings the browsers are issuing will simply disappear. - benjsc
[00:03] | [] | # | TB

Wed, 14 May 2008

fd.o logins

As you may have noticed, all SSH logins on fd.o have been disabled while we fix up the OpenSSL/OpenSSH trainwreck. We'll try to have them back as soon as possible: hopefully before Thursday afternoon UTC. -daniels
[18:11] | [daniels] | # | TB

Sun, 10 Feb 2008

Annarchy is back & patched

Turns out some bugs in the Linux kernel are quite annoying. Annarchy became a little unstable recently but has now been patched to fix the issues. Please let me know if you notice any regressions.

- benjsc

[18:18] | [benjsc] | # | TB

Wed, 23 Jan 2008

Planet.fd.o is back

Turns out that planet.fd.o was not up to date due to a corrupt bdb cache file.

The cache was nuked and the RSS feeds rebuilt, so planet is back in operation.


[15:03] | [benjsc] | # | TB

Tue, 22 Jan 2008

NEEDINFO returns to Bugzilla - sortof

Hi Folks, Thanks for your patience in this. I only started looking into this earlier today and think I've found a solution that works - sort of. NEEDINFO disappeared because Mozilla removed it in version 3.0.0 of Bugzilla. Their reasoning is to force accountability for every bug, i.e. a bug is always assigned to someone.

Since we run a stock standard bugzilla we found out about the removal the hard way. From an admin point of view I'd prefer not to modify bugzilla as changes get lost or need to be reimplemented each time bz is upgraded.

However, after watching how people in work, it's clear we do need something; I think implementing NEEDINFO as a keyword is the best option for now and hence I've now added:


as a keyword. To get a list of all the bugs without the keyword sadly an advanced search ( is needed. If you don't want to figure out the advanced search simply add:


to any search url and you'll get the relevant bugs removed - you can then save that as a custom search.

[17:04] | [benjsc] | # | TB

The Spam Cleanup Begins

It seems that spam has slowly been building up on many of the fd.o mailing lists. In fact, a backlog of ~80k messages awaiting moderation exists. I've started the cleanup process, so don't be surprised if some old messages start appearing on the mailing lists. I expect this cleanup to take about a week.


[17:03] | [benjsc] | # | TB

Sun, 13 Jan 2008

Git Upgraded on kemper (aka.

Just a note that git has been upgraded on kemper from 1.4.4 to 1.5.x. If you notice any issues please let me know asap. -
[19:44] | [benjsc] | # | TB

Fri, 11 Jan 2008

Bugzilla Upgrade Complete

Hi Folks, The upgrade of to Bugzilla 3.0.3 is now complete.

The upgrade brings about a major change to how bugs are processed. The 'NEEDINFO' bug state was removed in Bz3.0.0. Instead the 'ASSIGNED' state should be used as shown in .

Whilst this might seem like an unintuitive change, Bz3 supports email correspondence. Hence if you need more information when working on a bug, indicate this to the bug reporter, set the bug state to ASSIGNED then submit. From that point on you can work with the user via email to resolve the problem.

All bugs previously in the NEEDINFO state have now been reopened, so feel free to assign any of them to yourself. Many of you will have received *lots* of email from this change; unfortunately it was unavoidable.

Finally if you notice any issues with the upgrade please let me know.


[04:39] | [benjsc] | # | TB

Tue, 08 Jan 2008

Bugzilla Upgrade Reminder

Just a reminder folks, Bugzilla is finally headed to version 3.0.2 on Friday 11/01/2008, 6pm AEST. During this time you won't be able to do anything bugzilla related. - Benjamin
[16:22] | [benjsc] | # | TB

Tue, 28 Aug 2007

LDAP speed bump

We recently hit a bit of pain migrating the LDAP database, and lost a few accounts. No user data was lost, just the entries in the group and password tables. I've attempted to reconstruct them as best I can, but there are still a few missing, either because they were not in any project groups before the migration, or because I can't find a GPG key for the account anywhere.
You'll know if this affects you because you won't be able to log in. If this is you, hunt me down on IRC and we can fix things up.
- ajax
[13:44] | [ajax] | # | TB

Thu, 22 Jun 2006

ViewVC upgrade

ViewCVS was flipping out for no good reason, so I upgraded us to ViewVC for all our legacy repobrowsing needs. I appear to have made it work, which is a bit surprising as my apache-fu is not strong. I also did just about the bare minimum of theme tweaking to make it look like an fdo site. Let me know if anything looks weird or fails or whatever.
- ajax
[14:03] | [ajax] | # | TB

Tue, 09 May 2006


With our spiffy new machines in place (thanks Google!), we've been slowly (emphasis on the word 'slowly') moving our services to the new machines, to take the load off the creaking gabe. annarchy is the designated general-access machine, as well as serving web stuff. planet.fd.o is already moved, as are xorg.fd.o and xcb.fd.o, and we're slowly migrating all the other wikis to annarchy (updating to a new Moin in the process). people.fd.o has moved too, so don't update your public_html on gabe; it's not useful anymore.
[03:18] | [daniels] | # | TB

care and feeding of your fd.o account

Long time no update (none this year, in fact). Oops.

One oft-neglected area of fd.o seems to be account maintenance. Probably something to do with the only procedure being 'ask sitewranglers for updates', and said sitewranglers disclaiming all responsibility, but not properly documenting the new procedure. Oops. :)

The procedure for account maintenance has been documented on the wiki for a while now. The new account policy is, in short: file a bug assigned to the project you want to join, attach (not paste) GPG and SSH public keys, get approval from someone in a position of authority in the relevant project, and reassign to the product, and the New Accounts component.

The procedure for care and feeding of your account is a bit more complex. We have a mail gateway set up to deal with most everything (including changing your SSH keys), provided you still have the GPG key you signed up with. There's some loose documentation on the AccountMaintenance page on the wiki; there's also some more comprehensive documentation at the Debian site (we use the same system, but without the web interface, at least for now).

Of course, if you need to attach a GPG key to your account, or something else out of the ordinary, the sitewranglers are still at your beck and call.
[03:02] | [daniels] | # | TB

Fri, 14 Oct 2005

New fd.o admin

I added Thomas Vander Stichele (of gstreamer) as an admin, so he can share the load of adding new users and the other mundane things that the rest of us admins slack off on doing.

[17:10] | [anholt] | # | TB

Sun, 11 Sep 2005

create.fd.o progress

Added a mysql database for the create.fd.o wiki.

[15:02] | [anholt] | # | TB

Mon, 29 Aug 2005

Welcome to xorg, vektor

vektor (Billy Biggs) has been added to the xorg group for the purpose of merging his bugfixes and optimizations to the fb code. Welcome!
[21:44] | [anholt] | # | TB

Wed, 24 Aug 2005

daniels gave me privs, so I went and closed some bugs.

Added Alexandre Prokoudine (prokoudine) for create.fd.o projects.

Added the openfontlibrary project, which sounds pretty awesome, and added rejon, bryce, nicubunu, and prokoudine. Created a bugzilla product for it. Now I need to figure out SVN and mailing lists.

Added the virtual host and got it all set up.

Briefly broke www.fd.o apparently by using /etc/init.d/apache2 reload instead of restart.

[23:07] | [anholt] | # | TB

Mon, 15 Aug 2005

disks break, hilarity ensues

So. Uhm. Yeah, stuff.

One of the disks in gabe's RAID array failed recently, and when the machine came back up from a reboot, the RAID controller took the disk's word for granted. Oops. The damage is confined to the root partition, so only stuff in /etc and /var is gone (in terms of data loss).

We've spent some time vi'ing config.pck files from Mailman, and attempting to reconstruct list configurations (most seem OK, except dbus, which looks to be rather terminally broken), and getting rid of extremely fun filesystem corruption all through the Postfix spool directory, etc. However, there's still a lot of transient breakage around. In particular, many bug attachments in Bugzilla went to a better place -- they simply disappeared. Some lists and list archives in particular are broken.

If there are any specific problems, please let us know on IRC on #freedesktop, or at 'gabe is broken, help pls', 'the website is broken', and 'wtf is up with bugzilla?' are not terribly helpful. Please try to be very specific.
[08:05] | [daniels] | # | TB

Sun, 10 Jul 2005

Bugzilla update

Updated to the security release of 2.18.3.
[09:26] | [anderson] | # | TB

Sat, 22 Jan 2005

fd.o updates

So, in the past few weeks, a lot has happened with In the manic pace of reconstruction that followed (take down machines for basic forensics, sit down and scribble out plans, attempt to acquire hardware, back up, reinstall, restore, build up new infrastructure), it can be said that we haven't been fantastically communicative. Most of the things that happen get mentioned on IRC shortly before or after they get done, as that's easier. So, sit back and enjoy the ride.

What happened?
On Nov 15th at approximately 00:07 PST, an intruder got access to fd.o via a simple TWiki shell injection attack. They were running PsyBNC and some other IRC proxy as www-data, in /var/tmp (/var/tmp/.cache, /var/tmp/.tmp, and a couple of others). In retrospect, we do not believe that they gained root (indeed, we are really quite sure they did not). However, at the time, we did not know this, and the Debian compromise showed us that root compromises aren't necessarily known at the time. So, following due paranoia^Wprocess, we decided to do it the hard way -- take it down and start from scratch.

And the recovery?
keithp and I did the initial installation, and Adam Conrad helped get a whole bunch of services up and configure them beyond that point. The existing services were quite typical of the problems we've had with fd.o -- it was a whole sort of general mish-mash that didn't scale well. Around twenty hacked-together 'solutions', none of which really worked well on its own, let alone coherently. This gave us an opportunity to sit down, and so we did -- projects have their own space under /srv/, and most services are segregated under there also (CVS, Bugzilla, et al). Because the structure was so radically different and at the time we didn't know if /home had been compromised, we decided to restore the old home directories as /home/compromised, and leave the rest as is. This means that if you had an account previously, your home directory -- including public_html -- has not been restored. This also holds true for projects.

Authentication is being handled with Debian's userdir-ldap. We have segregated the LDAP server on to kara, another machine located on an internal network. I originally set up OpenLDAP with SSL/TLS across the network, but that fell over within an hour; the load there being me setting some stuff up, and keithp poking around. Clearly, it was unsuitable for general use. We're now using the excellent userdir-ldap system, which is also deployed within Debian, which lets users manage their own records to a degree. The admin interface is far, far nicer than we ever had. The previous system spat out '/usr/bin/dialog is unhappy with your terminal' all the time for no apparent reason, would return a blank dialog for success on any operation, and a dialog saying 'Successfully [did stuff] to [object]' upon failure. It was also pretty incomplete.

While we were at it, we dealt with migration to Apache 2, new ViewCVS, a newer version of Subversion (and hopefully the fsfs backend), newer Bugzilla, a chrooted pserver for CVS (/srv/, a chrooted BIND 9, et al.

Where are we now?
Right now, we're standing on a pretty strong grounding, I believe. The standards are back up, thanks to Waldo Bastian and Chris Lee, and we now have a far stronger system for dealing with users/projects; translators are now properly on the cards. I am in the middle of writing a small daemon to deal with download.fd.o. The idea is that people upload tarballs and whatever along with a GnuPG-signed .changes file, which gets processed by the queue reaper, and put into the download.fd.o structure. With proper segregation of projects and write access only through a given daemon which logs an audit trail, we believe this will be a good bit more secure. Not least, it also gives us something we can properly present to mirrors, so we can get all the important bits mirrored out to the world.
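As a rough illustration of the queue reaper idea described above, here is a minimal Python sketch. It is not the actual daemon: the manifest format (one `<sha256> <filename>` pair per line), the function names and the audit log being a plain list are all assumptions made for the example, and it presumes the GnuPG signature on the manifest has already been verified by some other step.

```python
import hashlib
import os
import shutil


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text):
    """Parse '<sha256-hex> <filename>' lines into a dict (hypothetical format)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        checksum, name = line.split(None, 1)
        # Reject names that could point outside the queue directory.
        if os.path.basename(name) != name or name in (".", ".."):
            raise ValueError("unsafe file name: %r" % name)
        entries[name] = checksum.lower()
    return entries


def reap(queue_dir, manifest_text, dest_dir, audit_log):
    """Move checksummed uploads from queue_dir to dest_dir, logging each one."""
    entries = parse_manifest(manifest_text)
    # Verify every file before moving anything, so that a bad upload
    # is rejected as a whole rather than half-published.
    for name, expected in entries.items():
        if sha256_of(os.path.join(queue_dir, name)) != expected:
            raise ValueError("checksum mismatch for %s" % name)
    os.makedirs(dest_dir, exist_ok=True)
    for name in sorted(entries):
        shutil.move(os.path.join(queue_dir, name), os.path.join(dest_dir, name))
        audit_log.append("accepted %s sha256=%s" % (name, entries[name]))
    return sorted(entries)
```

Checking all checksums before the first move is a deliberate choice in this sketch: a mismatch leaves the public tree untouched, and the audit log only ever records files that were actually published.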

Where are we going?
Brrrrr. Many member projects -- GStreamer, modular X, others -- are going really cool places. As an organisation, we still have several works in progress. download.fd.o is a large part of that, and we also have other infrastructural work (mainly separation of services) that are pending getting some more machines. The most important work has been done; mainly reinstallation, getting a proper, scalable, setup together to deal with many projects, separating out the LDAP server, download.fd.o is almost done, etc. The main thing that remains to be done is for projects to deal with migrating to the new structures (if you need to be active, email, and for pretty much everyone in the project to get GnuPG keys. Yes, you.

That aside, there's not a terrible lot to be done; most of my personal wishlist for infrastructure and services has been dealt with, which makes me happy. As does the prospect of a (relatively) long and relaxing Christmas/New Year holiday period. Cheers.

Daniel, on behalf of the sitewranglers
[02:03] | [daniel] | # | TB

Mon, 17 Jan 2005

Bugzilla update

Updated to the official release of 2.18.
[15:11] | [anderson] | # | TB

Wed, 12 Jan 2005

hooray for isec

Threw 2.6.10 on to gabe this evening with the new iSec patch for a nice little privilege-escalation race on SMP machines; SSH access was out for about half an hour. -daniels
[07:34] | [daniel] | # | TB

Thu, 06 Jan 2005

Bugzilla patch

Patched Bugzilla 2.18.rc3 to plug an XSS vulnerability.
[19:38] | [anderson] | # | TB

Tue, 14 Dec 2004

apache limits

I bumped the MaxClients (and ServerLimit) in apache2.conf from 20 to 500 which matches the old apache1 configuration. I'd like to know why we're not running the fancy threaded apache configuration; that should be able to handle a lot more clients. -keithp
[09:28] | [keithp] | # | TB

Sun, 21 Nov 2004

virtual host

I created a virtual host for that just runs viewcvs.cgi directly. Check out /etc/apache2/sites-enabled/ for details. -keithp
[18:43] | [keithp] | # | TB

Sat, 20 Nov 2004

brave new wor^Wgabe

As you may have noticed, sort of got compromised a few days back. By 'sort of', I do, of course, mean 'totally'. Adam Conrad noticed a few thousand bounces in his inbox courtesy of being on www-data, and that they were all for spams being sent as www-data. Whoops. We started hunting for an insecure, but when we took a look at lsof and discovered an IRC proxy running, we decided it was something more insidious. From there, the machine got killed to all access but ours, and we started tracking down the point of entry. It turned out that it was compromised via a hole in TWiki, but no news was to be found on the TWiki site about this hole, nor was there a new release. How not to do security 101.

At this point, we came to the conclusion that all we could do from here was reinstall, so Keith got a call (from an Australian mobile, roaming into the UK, to a US mobile; I fear to think how much that cost) letting him know the score. Local muscle on site, we dug in and prepared for a reinstall. Most people familiar with the setup (and my writings on here about 'ill') know that the setup was accumulated, not designed, and was horrifically out of control. It was a mess, and probably incredibly insecure. Very few things were done properly to scale to where it was. So, we took a deep breath, and noted that this was a blessing in disguise as we got to sit back and have a think about what we were doing this time. I got out some pieces of paper and started scribbling (across six of them, actually), and we all got chatting on what we could do when we rebuilt it all.

LDAP is already running on a separate machine, using Debian's userdir-ldap. We have a separate source machine on our hitlist; hosting only CVS/SVN/Arch repositories, and various web downloads. These downloads would have to be signed for somehow, and all provided in a common download area. Three huge hits: we're mirrorable, there's an audit trail and security on source, and the general access machine and the source server are totally separate. Rock on.

SSH access is open to the general public, with the old home directories in /home/compromised. If you administer a project with CVS or whatever, please check that it hasn't been tainted. You can compare the repositories in /cvs and /compromised-cvs to see the difference; /cvs contains the repositories as they were on 15th Oct.
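For project admins who want to automate that comparison, the following is a minimal Python sketch of one way to do it. The helper names are made up for the example, and it only compares file contents, not permissions or ownership.

```python
import filecmp
import os


def relative_files(root):
    """Collect every non-directory entry under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_trees(trusted, suspect):
    """Return (only_in_trusted, only_in_suspect, changed) as sorted lists."""
    trusted_files = relative_files(trusted)
    suspect_files = relative_files(suspect)
    changed = [
        rel for rel in trusted_files & suspect_files
        # shallow=False forces a content comparison instead of trusting
        # size and mtime, which an intruder could have preserved.
        if not filecmp.cmp(os.path.join(trusted, rel),
                           os.path.join(suspect, rel), shallow=False)
    ]
    return (sorted(trusted_files - suspect_files),
            sorted(suspect_files - trusted_files),
            sorted(changed))
```

A call such as `compare_trees("/cvs/myproject", "/compromised-cvs/myproject")` (where `myproject` is a placeholder) would then list files that were removed, added or modified after the known-good snapshot. Legitimate commits made after the snapshot will show up as changes too, so the output is a starting point for review rather than a verdict.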

Administering your account requires a GPG key. Admins will be rather loath to perform menial duties (e.g. changing SSH keys) on a regular basis, so if you ask us for anything, make sure it's to add a GPG key to your account. This way, it's the same amount of work for us, and it ensures that you can take care of your own account in future: less work for both of us, and less time spent waiting.

Did I mention you should all have GnuPG keys? No, really. They're incredibly useful. If we had signed copies of everything, verification would be an utter doddle. But we don't, so it isn't.

Enjoy the new -daniels
[16:31] | [daniels] | # | TB

more drm repocopies

Copied some sis stuff missed in the previous repocopies.
[15:42] | [anholt] | # | TB
