Category Archives: sysadmin

tektonic resolution

Ok, some time around 11:30ish local time, I noticed that A’tuin wasn’t black holing connection attempts. After a bit of difficulty, I managed to actually connect to the Virtuozzo console – but couldn’t log in. After a few more minutes, it actually let me log in to VZPP, but I noticed that I did not have a functional SSH daemon running.

Much toil and one hour later, I discovered a weird problem in the system startup scripts and repaired them to actually boot our services correctly again.

The database server is running, the mud is running, the web server is running. Life is, in general, good again.

I am wondering what the official response by TekTonic will be on this issue – since after something well over 30 hours, they still haven’t made a public statement on the outages forum, and haven’t given satisfactory answers to my trouble tickets.

Oh well, problem is over for now. I’ll look for alternative hosting, but am not very hopeful of finding anything with this set of server options anywhere near this price bracket. I wonder if an entire day of unplanned and unexplained downtime is worth the thousand dollars or so a year that I am saving by going with them as opposed to my next best option? Dunno.

At least I’ll be able to sleep tonight w/o waking up to 50 user complaints 😛

tektonic problems

As some of you may know, I have stopped hosting my own servers recently – instead paying for space on other people’s machines. This site (as well as a few others I run) is located on pair Networks space.

However, several of my projects (the mud for one) require more than mere web space. Thus, I investigated alternative options and have been enjoying TekTonic’s VPS service for the past several months (the hostname being A’tuin).

Yesterday afternoon, they had a serious networking problem. I found out about it around an hour after things went down (my IM started ringing off the hook as users from all over checked in to see what was wrong). I submitted a trouble ticket and got a prompt answer – that things were down and they were fixing them.

An hour later, a sales rep closed my ticket, saying that the problem had been resolved and the affected servers were booting back up.

An hour later, we still didn’t have service. So I started sending follow-ups to the first ticket in hopes of receiving news of the problem. No such luck. They didn’t respond to any of my queries, and they didn’t post any news of the problem to their support forums.

This morning, I sent a 911 ticket to the sales department, and after about 20 minutes, got another short response. This is the last I have heard from them. They’ve still not posted a formal announcement of the details and I am definitely not the only user being affected by this.

The TAMS Alumni site is probably located on the same machine as mine (since they’re successive IPs), and some other users have complained on the forums (this thread, not about our problem but it was the newest thread in the outage category). Another good thread on the subject on their forums is here. There are at least 5 more threads all related to this same problem – and still no official announcement on the subject 😛

My saga of tickets so far goes like this (timestamps are east coast):

#33610: atuin.simud.org unreachable

me, 01/16/2006 6:24:09PM

For at least an hour it seems, my VPS has been inaccessable via any means. An SSH session I had open to the machine was just hung and any attempts to connect to any services running on the account or log in to the web interface at [address] have failed.

Since there is no message posted in the forums, I am assuming this is a new issue. Thank you for getting this back online as soon as possible.

them, 01/16/2006 6:25:49PM

Hi
There is a network issue at present that is affecting a large number of our servers.
We are working on the issue as fast as we can.

Rob
Support

them, 01/16/2006 7:29:42PM

We had some power issues on some the racks, it has been taken care of, the
servers are coming back up and may already be up.


-Ryan M. Adzima
Tektonic Network Solutions | sales@tektonic.net

[ticket closed]

me, 01/16/2006 8:19:06PM

It has been 50 minutes since you declared the problem solved, yet, my machine is still down.

me, 01/16/2006 9:22:31PM

Hello? Any response would be nice.

My VPS is still not up, it has now been about two hours since you said that devices were coming back up. I’m guessing that it doesn’t take this long to boot a machine.

[two more posts of this nature snipped because they’re not that interesting]

#34155: poor customer support

me, 01/17/2006 12:32:46PM

Howdy.

Yesterday afternoon, there seems to have been a fairly big networking/hardware problem that affected multiple servers, including the one that my services are operating off of.

I submitted a trouble ticket and got an almost immediate response – they were actually working on the problem. Then, an hour later, Ryan Adzima closed my ticket saying that problems had been resolved.

They have not. My machine is still inaccessable, and despite my submitting multiple follow-up requests to the ticket, I have not heard back from the support department since they closed my issue in the first place.

As I am rapidly approaching 24 hours of downtime and since that ticket (#33610) is apparently being ignored, I am attempting submission of a new one in the hopes that I will actually get a response this time.

<Insert angry words here>

them, 01/17/2006 12:53:42PM

The support department is quite busy right now dealing with an outage on a particular server that was affected by the power issue. They are troubleshooting the problems but it looks like it may be a hardware issue. I understand that this is an unacceptabel amount of time, but the team has been up all night dealing with it.


-Ryan M. Adzima
Tektonic Network Solutions | sales@tektonic.net

[ticket closed]


I love how he can’t spell the word unacceptable 🙂

jabber advocacy

The last few months have seen me becoming more and more evangelical when it comes to the wonderfulness that is Jabber. I’ve been running jabberd 2 on Hedwig for at least a year now, probably longer (although it looks like I am currently three revisions behind the most recent release…).

I’ve been using jabber slightly longer than that because, when complemented by a good client like Psi, I am able not only to secure my communication with the server but, with the help of GnuPG, to encrypt conversations with actual people. I pretty much always encrypt conversations with Peter and have used it in the past to securely transfer bank information and stuff.

Peter showed me a relevant post from Drunken Batman’s Blog a little while ago while we were discussing our project for the PyWeek competition that we’ve foolishly engaged ourselves in 😉

The blog pretty much confirms everything I’ve been expecting over the last little bit. The commercial IM providers are looking to kill access to their services by old and third-party clients. They don’t like Google. Google is going to do nice things for Jabber, etc…

OSX isn’t doing bad by Jabber either. Their server release ships with ‘iChat Server’ which is an Applefied front-end for administration of a pretty stock jabberd 1.4 server. By default, all users on the server have jabber accounts. The only tweak I had to do on Alumni was to uncomment server to server functionality. (There is no GUI for it, so I had to manually edit the XML config file.) Without having done this, the server would have been only useful for internal communication. It is now possible for users to IM me on my personal account instead of forcing me to log in to the mac.

Upon having fixed this feature, I sent out a huge email to all of our users. I am giving them the promise that if they take the trouble to start using Jabber, they’ll be able to get better response times out of me for their service requests 🙂

I’ve also mentioned Jabber to Sarah again today. We’ll prolly be setting her up with it as well so she doesn’t have to deal with certain elements of ICQ yuckiness – such as getting through pesky firewalls at work 😛 Last night, I wound up getting to set up Kyle and Stori’s computer for the service as well – Dad popped up out of the blue and Dallin and I helped him get things going on the box they’ll be hauling out to her this weekend.

I’ve made sure that Penny’s using her account again on both machines she logs in to and we’ll probably be sneaking it on to her mother’s machine when we’re down babysitting Dallin 😉

alumni migrationness

This post was somehow flagged as a draft for over a year. I’m not sure why, or if I meant to go into details about my displeasure with the OSX mail system. I do remember that I solved the problem and spent the next week figuring out how to actually copy emails from a traditional unix mail spool to the Mac’s Cyrus IMAP database monstrosity. I eventually wrote a Java application that acted as an IMAP client, logged into both email accounts (the old Debian machine and the new mac) and copied messages over manually.

But shrug, I figure I may as well activate this post 🙂

– Ammon [Nov 9, ’06]
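
For anyone facing the same copy job: the idea of logging in to both servers as an IMAP client and copying messages across can be sketched roughly as below. This is not the Java application mentioned in the note, just a minimal illustration using Python's standard imaplib. The function name copy_mailbox and the choice to preserve only the original delivery date (not flags) are assumptions made for the sketch.

```python
import imaplib


def copy_mailbox(src, dst, mailbox="INBOX"):
    """Copy every message in `mailbox` from src to dst; return how many were copied.

    `src` and `dst` are already-authenticated imaplib.IMAP4 (or IMAP4_SSL)
    connections. Hypothetical helper, not the original Java tool.
    """
    # Open the source read-only so the old server's flags are left untouched.
    src.select(mailbox, readonly=True)
    typ, data = src.search(None, "ALL")
    if typ != "OK":
        raise RuntimeError("search failed on source server")

    copied = 0
    for num in data[0].split():
        typ, parts = src.fetch(num, "(INTERNALDATE RFC822)")
        if typ != "OK" or not isinstance(parts[0], tuple):
            continue  # skip anything the server would not hand over
        header, raw = parts[0]
        # Keep the original delivery date if it can be parsed (else None).
        when = imaplib.Internaldate2tuple(header)
        typ, _ = dst.append(mailbox, None, when, raw)
        if typ == "OK":
            copied += 1
    return copied
```

In real use, both connections would come from something like imaplib.IMAP4_SSL(host) followed by login(user, password), once per account on each machine. The destination mailbox has to exist before append is called.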

Yesterday (the 12th), I spent the entire day working on the alumni migration.

I had some usable code that was almost ready to start testing on the 11th – but connectivity went funky. Everyone got kicked from the mud except me; DNS resolution was spotty; Hydra seemed able to talk to most of the outside world but not all of the inside; Sora was able to see Hydra in order to IM me – but not Hedwig, who is sitting on the same switch (and KVM). This downtime had the lovely effect of hosing my active file edit, so I lost a good deal of work – but having already done it, the mundane parts flew by.

Manual creation of user accounts under OSX is a tricky thing. It is kind of a chicken vs egg problem – at least when using the password server. In order to create the LDAP auth fields, you need to first register the user’s password. That’s right. They need to have their password in the system before they can create an account 😉 I wound up doing this in like 4 phases – including a good bit of messing around with proc_open() arcana.
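
The proc_open() part of a job like this usually comes down to handing a password to a child process over a pipe rather than on its command line. Here is a minimal Python sketch of that one idea, using the standard subprocess module. The actual OSX password server and LDAP commands are not given in this post and are not reproduced here. The child below is a stand-in that only reads one line from stdin, and run_with_secret is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_secret(argv, secret):
    """Run argv, write `secret` plus a newline to its stdin, return (exit code, stdout)."""
    result = subprocess.run(
        argv,
        input=secret + "\n",   # sent over a pipe, so it never shows up in `ps`
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode, result.stdout


# Stand-in for the real account tool (an assumption for illustration only):
# it reads one line from stdin and prints how many characters it received.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; print(len(sys.stdin.readline().strip()))",
]

if __name__ == "__main__":
    code, out = run_with_secret(STAND_IN, "example-password")
    print(code, out.strip())
```

A multi-phase setup like the one described would call something of this shape once per phase, checking the exit code before moving on to the next.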

That was a fun and exciting problem to solve. I felt a substantial productivity high while working on it and for a good while after I got it working.

Then, of course, things got worse, and the stupid problem arose. It is apparently all but impossible to create an email account on an OSX server without using a GUI. That’s right. This is a BSD Unix machine that I spent an entire day hacking from a command-line, and they want me to click on a little box in the management console app in order to allow users to receive mail.

Words really can’t begin to describe my annoyance at this – and I’m not going to try right now, but trust me, it’s a bad and nasty problem. Somebody at Apple (preferably plenty of people) deserves a healthy dose of violent reeducation.