ServeRAID – Fun with GUI-Tools

We’ve recently bought three more drives for our in-house file server. Up until now, we had a RAID 5 array (using an IBM ServeRAID controller) spanning three 33GB drives. That array recently got very, very close to being full.

So today, I wanted to create a second array using the three new 140GB drives.

When you download the ServeRAID support CD image, you get access to a nice GUI tool, written in Java, which can be used to create arrays on these ServeRAID controllers.

Unfortunately, I wasn’t able to run the GUI at first because somehow, the Apple X11 server wasn’t willing/able to correctly display the GUI. I always got empty screens when I tried (the server is headless, so I had to use X11 forwarding via ssh).

Using a Windows machine with Xming (which is very fast, works perfectly and is totally free as in speech) did work, though, and I got the GUI running.

All three drives were recognized, but one was listed as “Standby” and could not be used for anything. Additionally, I wasn’t able to find any way in the GUI to actually move the device from Standby to Ready.

Even removing and shuffling the drives around didn’t help. That last drive was always recognized as “Standby”, independent of the bay I plugged it into.

Checking the feature list of that controller showed nothing special – at first I feared that the controller just didn’t support more than 5 drives. That fear proved unfounded, though: the controller supports up to 32 devices – more than enough for the server’s 6 drive bays.

Then, looking around on the internet, I didn’t find a solution for my specific problem, but I did find out about a tool called “ipssend”, and an old manual by IBM documented how to use it.

Unfortunately, newer CD images don’t contain ipssend any more, forcing you to use the GUI – which, in this case, didn’t work for me. It may be that there’s a knob to turn somewhere, but I just failed to see it.

In the end, I found a very, very old archive called dumplog on the IBM website which contained that ipssend command in a handy little .tgz archive. Very useful.

Using that utility solved the problem for me:

# ./ipssend setstate 1 1 5 RDY
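(If I read that old manual correctly, the arguments are the controller number, the channel, the SCSI ID of the device, and the target state – RDY meaning Ready.)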

No further questions asked.

Then I used the Java-GUI to actually create the second array.

Now I’m asking myself a few questions:

  • Why is the state “Standby” not documented anywhere (this is different from a drive in Ready state configured as a standby drive)?
  • Why is there no obvious way to de-standby a drive with the GUI?
  • Why isn’t that cool little ipssend utility officially available any more?
  • Why is everyone complaining that the command line is more complicated to use and that GUIs are so much better when, obviously, the opposite is true?

The pain of email SPAM

Lately, the SPAM problem has gotten a lot worse in my email INBOX. Spammers increasingly seem to check whether their mail gets flagged by SpamAssassin and tweak their messages until they get through.

Due to some tricky aliasing going on on our main mail server, I’m unable to properly use SpamAssassin’s Bayes filter there. You see, I have a practically infinite number of addresses which are, in the end, delivered to the same account, and all that aliasing can only be done after the message has passed SpamAssassin.

This means that even though mail may go to one and the same user in the end, it’s seen as mail for many different users by SpamAssassin.

This inability to use Bayes with SpamAssassin means that lately, SPAM has been getting through the filter.

So much SPAM that I began getting really, really annoyed.

I know that mail clients themselves also have Bayes-based SPAM filters, but I often check my email account with my mobile phone or on different computers, so I’m dependent on a solution that filters out the SPAM before it reaches my INBOX on the server.

The day before yesterday I had enough.

While all mail for all domains I’m managing is handled by a customized MySQL-Exim-Courier setup, mail to the @sensational.ch domain is relayed to another server and then delivered to our Exchange server.

Even better: that final delivery step is done after all the aliasing steps (the catch-all aliases being the difficult part here) have completed. This means that I can in fact have all mail to @sensational.ch pass through a Bayes filter, and the messages will all be filtered for the correct account.

This made me install DSPAM on the relay that transmits mail from our central server to the Exchange server.

Even after only one day of training, I’m getting impressive results: DSPAM only sees mail that isn’t already flagged as spam by SpamAssassin – which means the junk it has to judge is exactly the carefully crafted kind that looks “real”.

Despite that, DSPAM now usually detects those junk messages, and I’m down to one false negative per 10 junk messages (and no false positives).

Even after running SpamAssassin and thus filtering out the obvious suspects, a whopping 40% of the emails I’m receiving are still SPAM. So nearly half of the messages not already filtered out by SA are junk.

Looking at the big picture, even when counting the various mails sent by cron daemons as genuine email, I’m getting much more junk email than genuine email per day!

Yesterday, Tuesday, for example, I got – including mails from cron jobs and backup copies of order confirmations for PopScan installations currently in public tests – 62 genuine emails and 252 junk mails, of which 187 were caught by SpamAssassin; the remaining 65 went to DSPAM, which caught all but two.

This is insane. I’m getting four times more spam than genuine messages! What the hell are these people thinking? With that volume of junk filling up our inboxes, how could any of these “advertisers” think that somebody is both stupid enough to fall for such a message and intelligent enough to pick theirs out from all the others?

Anyways. This isn’t supposed to be a rant. It’s supposed to be praise for DSPAM. Thanks guys! You rule!

Intel Mac Mini, Linux, Ethernet

If you have one of these new Intel Macs, you will sooner or later find yourself in the situation of having to run Linux on one of them. (Ok. Granted: The situation may be coming sooner for some than for others).

Last weekend, I was in that situation: I had to install Linux on an Intel Mac Mini.

The whole thing is quite easy to do, and if you don’t need Mac OS X, you can just go ahead and install Linux like you would on any other x86 machine – provided the hardware is sufficiently new to have the BIOS emulation layer already installed; otherwise you have to install the firmware update first. You’ll notice the lack of it by the Mac not booting from the CD despite holding c during the initial boot sequence.

You can partition the disk to your liking – the Mac bootloader will notice that there’s something fishy with the partition layout (the question-mark-on-a-folder icon will blink one or two times) before passing control to the BIOS emulation, which will be able to boot Linux from the partitions you created during installation.

Don’t use grub as bootloader though.

I don’t know if it’s something grub does to the BIOS or if it’s something about the partition table, but grub can’t launch stage 1.5 and thus is unable to boot your installation.

lilo works fine though (use plain lilo when using the BIOS emulation for the boot process, not elilo).

When you are done with the installation process, something bad will happen sooner or later though: Ethernet will stop working.

This is what syslog has to say about it:

NETDEV WATCHDOG: eth0: transmit timed out
sky2 eth0: tx timeout
sky2 eth0: transmit ring 60 .. 37 report=60 done=60
sky2 hardware hung? flushing

When I pulled the cable and plugged it in again, the kernel even oops’ed.

The Macs have a Marvell Yukon ethernet chipset. This is what lspci has to tell us: 01:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 22). The driver to use in the kernel config is “SysKonnect Yukon2 support (EXPERIMENTAL)” (CONFIG_SKY2).

I guess the EXPERIMENTAL tag is warranted for once.

The good news is that this problem is fixable. The bad news: it’s tricky to do.

Basically, you have to update the driver to the version in the repository of what’s going to become kernel 2.6.19.

Getting a current version of sky2.c and sky2.h is not that difficult. Unfortunately though, the new driver won’t compile with the current 2.6.18 kernel (and upgrading to a pre-rc is out of the question – even more so considering the ton of stuff going into 2.6.19).

So first, we have to patch in this changeset to make the current release of sky2 compile.

Put the patch into /usr/src/linux and apply it with patch -p1.

Then fetch the current revision of sky2.c and sky2.h and overwrite the existing files. I used the web interface to git for that as I have no idea how the command line tools work.

Recompile the thing and reboot.
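In shell terms, the whole dance looks roughly like this (the file names and download locations are placeholders for wherever you put the bits):

cd /usr/src/linux
patch -p1 < ~/sky2-compile-fix.patch   # the changeset mentioned above
cp ~/sky2.c ~/sky2.h drivers/net/      # the current files from the git web interface
make && make modules_install           # rebuild kernel and modules
reboot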

For me, this fixed the problem with the sky2 driver: the machine in question has now been running for a whole week without any networking lockups – despite heavy network load at times.

While I’m happy to see this fixed, my statement – made ages ago in posting number 6 here on gnegg.ch – about not buying too-new hardware if you intend to use Linux on it seems to continue to apply.

OS X 10.4.8 – Update gone wrong

Today, Software Update popped up and offered to upgrade the OS to 10.4.8.

Usually I turn down such offers as I don’t want to reboot my system in the middle of the day, but it felt like a good time to do it nonetheless. This is why I accepted.

After the installation, the update asked me to reboot, which I did.

What came afterwards was as scary as it was ironic: The system rebooted into Windows XP.

But no worries: the 10.4.8 update isn’t a Windows installation in disguise. The Windows installation that greeted me was the one I have on a second partition – mostly to play WoW (which I don’t do any more).

A quick reboot showed me even more trouble: Whenever my MacBook tried to boot from the MacOS partition, it showed the folder-with-question-mark icon for a few seconds and then the EFI BIOS emulation kicked in and booted from the MBR, which is why I was seeing Windows on my screen.

Now, I’d gladly explain here what had gone wrong and how I fixed it, but as I was in a state of panic, I didn’t exactly document my fix; and because I tried many steps at once without checking after each one whether it had fixed the problem, I don’t even know what was wrong (which certainly doesn’t stop me from guessing).

Anyways.

I booted from the MacBook’s install DVD, first selected Disk Utility in the tools menu and let it check the disk for errors (none found, as I had expected), then let it repair permissions (tons of errors found, but I doubt this was the problem).

Then I quit Disk Utility and launched Terminal.

Besides the fact that I had some trouble actually entering commands (how do I set the keyboard layout in that pre-install terminal?), I quickly went to /System/Library, deleted the extensions cache (Extensions.kextcache), went to /System/Library/Extensions and removed all extensions installed by Parallels (which I suspected of being responsible for the problem).

I think the list was vmmain.kext, helper.kext, Pvsnet.kext and hypervisor.kext. You have to remove them with rm -r as they are bundles (directories).
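From the terminal, that boils down to something like this (assuming the list above is right – and note that when booted from the DVD, the installed system actually lives under /Volumes, so the path needs adjusting accordingly):

cd /System/Library/Extensions
rm -r vmmain.kext helper.kext Pvsnet.kext hypervisor.kext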

After that, I rebooted the system and the question-mark-on-a-folder disappeared and the updating process completed.

I can’t tell you how scared I was: My OS X installation is tweaked to oblivion and I’d really, really hate to lose all the stuff. Don’t mind the data – it’s configuration files and utilities and of course fink.

*shudder*

As I did not try to reboot after completing each of the steps above, I’m unable to say what actually caused the problem. I doubt it was Parallels, though, as I’m currently running 10.4.8 and Parallels (which I had to reinstall, of course). I also doubt it was the permissions issue, as wrong permissions are unlikely to cause a boot failure.

So it probably was a corrupted extensions cache. Or the update process not being able to cope with the Parallels extensions.

Me being in the dark makes me unable to place blame, so you won’t find any statement about how a more or less forced OS update should never cause a failure like this…

For all I know, this could have happened without the update anyways.

The good news, on the other hand, is that I’m slowly reaching a state where I am as good at fixing Macs as I am at fixing Windows and Linux boxes. Just don’t tell this to my friends who have Macs.

SQLite, Windows Mobile 2005, Performance

As you know from previous posts, I’m working with SQLite on mobile devices, which lately means Windows Mobile 2005 (there was a Linux device before that though, but it was hit by the RoHS regulation of the European Union).

In previous experiments with the older generation of devices (Windows CE 4.x / PocketPC 2003), I was surprised by the high performance SQLite is able to achieve, even in complex queries. But this time, something felt strange: Searching for a string in a table was very, very slow.

The problem is that CE5 (and with it Windows Mobile 2005) uses non-volatile flash for storage. This has the tremendous advantage that the devices don’t lose their data when the battery runs out.

But compared to DRAM, Flash is slow. Very slow. Totally slow.

SQLite doesn’t load the complete database into RAM; it only loads small chunks of the data. This in turn means that when you have to do a sequential table scan (which you have to do when you have a like '%term%' condition), you are more or less dependent on the speed of the storage device.

This is what caused SQLite to be slow when searching. It also caused synchronizing data to be slow, because SQLite writes data out into journal files during transactions.

The fix was to trade off launch speed (the application is nearly never started fresh) for operating speed by loading the data into an in-memory table and using that for all operations.

attach ":memory:" as mem;

create table mem.prod as select * from prod;

Later on, the trick was to just refer to mem.prod instead of just prod.
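For example, a search that used to grind through flash storage now scans RAM instead (the column name is made up):

select * from mem.prod where name like '%term%';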

Of course you’ll have to take extra precautions when you store the data back to the file, but as SQLite even supports transactions, most of the time you get away with

begin transaction;

delete from prod;

insert into prod select * from mem.prod;

commit;

So even if something goes wrong, you still have the state of the data from the time it was loaded (which is perfectly fine for my usage scenario).

So in conclusion some hints about SQLite on a Windows Mobile 2005 device:

  • It works like a charm
  • It’s very fast if it can use indexes
  • It’s terribly slow if it has to scan a table
  • You can fix that limitation by loading the data into memory (you can even do it on a per-table basis)

Upgrading the home entertainment system

The day when I will finally move into my new flat is coming closer and closer (expect some pictures as soon as the people currently living there have moved out).

Besides thinking about outdated and yet necessary stuff like furniture, I’m also thinking about my home entertainment solution which currently mostly consists of a Windows MCE computer (terra) and my GameCube (to be replaced with a Wii for sure).

The first task was to create distance.

Distance between the video source and the projector. Currently, that’s handled simply by having the MCE connected to the projector via VGA (I’d prefer DVI, but the DVI output is taken by my 23″ Cinema Display), while the GC, the PS2 and the Xbox 360 are connected via composite to my receiver, and the receiver via composite to the projector.

The distance between the projector and the receiver/MCE is currently about three meters tops, so no challenge there.

With a larger flat and a ceiling mounted projector, interesting problems arise distance-wise though: I’m going to need at least 20 meters of signal cable between receiver and projector – more than what VGA, DVI or even HDMI are specified for.

My solution in that department was the HDMI CAT-5 Extreme by Gefen. It’s a device which allows sending HDMI signals over two normal ethernet cables (shielded preferred), reaching distances of up to 60 meters.

Additionally, CAT-5 cables are lighter, easier to bend and much easier to hide than HDMI or even DVI cables.

Now, terra only has a DVI and a VGA out. This is a minor problem though, as HDMI is basically DVI plus audio, so it’s very easy to convert a DVI signal into an HDMI one – it’s just a matter of connecting the pins on one side with the pins on the other side; no active electronics needed.

So with the HDMI CAT-5 Extreme and a DVI-to-HDMI adaptor, I can connect terra to the projector. All well, with one little problem: I can’t easily connect the GameCube or the other consoles any more. Connecting them directly to the projector is not an option as it’s ceiling mounted.

Connecting them to my existing receiver isn’t a solution either, as it doesn’t support HDMI – putting me back into the same distance problem yet again.

While I could probably use a very good component cable to transport the signal (it is, after all, an analog signal), it would mean I have three cables going from the receiver/MCE combo to the projector: two for the HDMI extender and one big fat component cable.

Three cables to hide and a solution at the end of its life span anyways? Not with me! Not considering I’m moving into the flat of my dreams.

It looks like I’m going to need a new receiver.

After looking around a bit, it looks like the DENON AVR-4306 is the solution for me.

It can upconvert (and is said to do so in excellent quality) any analog signal to HDMI with a resolution of up to 1080i which is more than enough for my projector.

It’s also said to provide excellent sound quality and – for my geek heart’s delight – it’s completely remote-controllable over a telnet interface via its built-in ethernet port. It’s even bidirectional: the documented protocol puts events on the line whenever operating conditions change, for example when the user changes the volume on the device itself.
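Just to illustrate what delights my geek heart here: remote-controlling the receiver could look something like this little C# sketch (the host name and the “MV50” volume command are pure assumptions on my part, not taken from the actual protocol documentation):

using System.Net.Sockets;
using System.Text;

class DenonRemote
{
    static void Main()
    {
        // Connect to the receiver's built-in ethernet port (port 23 = telnet).
        using (TcpClient client = new TcpClient("denon.local", 23))
        {
            NetworkStream stream = client.GetStream();
            // Send a plain ASCII command, e.g. set the master volume (assumed syntax).
            byte[] command = Encoding.ASCII.GetBytes("MV50\r");
            stream.Write(command, 0, command.Length);
            // Bidirectional: the receiver pushes events (volume changes etc.)
            // down the same connection, so we could keep reading from the stream here.
        }
    }
}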

This way, I can have all sources connected to the receiver and the receiver itself connected to the projector over the CAT-5 Extreme. Problem solved – and considering how many input sources and formats the Denon supports, it’s even quite future-proof.

I’ve already ordered the HDMI extender and I’m certainly going to have a long, deep look into that Denon thing. I’m not ready to order just yet though: it’s not exactly cheap, and the price may just drop a little bit between now and November 15th, when I’m (hopefully) moving into my new home.

Windows Vista, Networking, Timeouts

Today I went ahead and installed the RC2 of Windows Vista on my media center computer.

The main reason for this was that that installation was very screwed up (as most of my Windows installations get over time – thanks to my experimenting around with stuff) and the recovery CD provided by Hush was unable to actually recover the system.

The hard drive is connected to an on-board SATA RAID controller which the XP setup does not recognize. Usually, you just put the driver on a floppy and use setup’s capability of loading drivers during the install, but that’s a bit hard without a floppy drive anywhere.

Vista, I hoped, would recognize the RAID controller and I read a lot of good things about RC2, so I thought I should give it a go.

The installation went flawlessly, though it took quite some time.

Unfortunately, surfing the web didn’t actually work.

I could connect to some sites, but on many others, I just got a timeout. telnet site.com 80 wasn’t able to establish a connection.

This particular problem was with my Marvell Yukon chipset based network adapter: it seems to miscalculate TCP packet checksums here and there, and Vista actually uses the hardware’s capability to calculate those checksums.

To fix it, I had to open the advanced properties of the network card, select “TCP Checksum Offload (IPv4)” and set it to “Disabled”.

Insta-Fix!

And now I’m going ahead and actually starting to review the thing.

lighttpd, .NET, HttpWebRequest

Yesterday, when I deployed the server for my PocketPC application to an environment running lighttpd and PHP with the FastCGI SAPI, I found out that the communication between the device and the server didn’t work.

All I got on the client was an exception because the server sent back error 417: Expectation Failed.

Of course there was nothing in lighttpd’s error log, which made this a job for Wireshark (formerly known as Ethereal).

The response from the server had no body explaining what was going on, but in the request-header, something interesting was going on:

Expect: 100-continue

Additionally, the request body was empty.

It looks like HttpWebRequest, with the help of the Compact Framework’s ServicePointManager, is doing something really intelligent which lighttpd doesn’t support:

By first sending the POST request with an empty body and that Expect: 100-continue header, HttpWebRequest basically gives the server the chance to do some checks based on the request header alone (Is the client authorized to access the URL? Is there a resource available at that URL?) without the client having to transmit the whole request body first (which can be quite big).

The idea is that the server does the checks based on the header and then either sends an error response (like 401, 403 or 404) or advises the client to go ahead and send the request body (code 100).
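Schematically, a successful exchange looks like this (path and length made up):

POST /upload HTTP/1.1
Host: server.example.com
Content-Length: 54321
Expect: 100-continue

HTTP/1.1 100 Continue

Only now does the client send the 54321 bytes of request body, after which the server sends the real response.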

Lighttpd doesn’t support this, so it sends that 417 error back.

The fix is to set Expect100Continue of System.Net.ServicePointManager to false before getting an HttpWebRequest instance.
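In code, the fix is essentially a one-liner – here’s a minimal sketch (the URL is a placeholder):

// Disable the Expect: 100-continue handshake globally,
// before any HttpWebRequest instance is created.
System.Net.ServicePointManager.Expect100Continue = false;

System.Net.HttpWebRequest request =
    (System.Net.HttpWebRequest)System.Net.WebRequest.Create("http://example.com/sync");
request.Method = "POST";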

That way, the .NET Framework goes back to plain old POST and sends the complete request body.

In my case, that’s no big disadvantage because if the server is actually reachable, the requested URL is guaranteed to be there and ready to accept the data on the HTTP level (of course there may be some errors on the application level, but there has to be a request body for those to be detected).

Tracking comments with cocomment

I’m subscribed to quite a long list of feeds lately. Most of them are blogs and almost all of them allow users to comment on posts.

I often leave comments on these blogs. Many times, they are as rich as a posting here, as I’ve got lots to say once you make me open my mouth. Many times, I quietly hope for people to respond to my comments. And I’m certainly eager to read these responses and to participate in a real discussion.

Now this is a problem: Some of the feeds I read are aggregated feeds (like PlanetGnome or PlanetPHP or whatever) and it’s practically impossible to find the entry in question again.

Up until now, I had multiple workarounds: some blogs (mainly those using the incredibly powerful Serendipity engine) provide the commenter with a way to subscribe to an entry, so you get notified by email when new comments are posted.

For all non-s9y blogs, I usually dragged a link to the site to my desktop and tried to remember to visit them again to check whether replies to my comments were posted (or maybe another interesting comment).

While the email method was somewhat comfortable to use, the link-to-desktop one was not: my desktop is cluttered enough with icons without these additional links anyway. And I often forgot to check them nonetheless (making a bookmark would guarantee my forgetting them; the desktop link at least provides me with a slim chance of not forgetting).

Now, by accident, I came across cocomment.

cocomment is interesting from multiple standpoints. For one, it just solves my problem as it allows you to track discussions on various blog entries – even if they share no affiliation at all with cocomment itself.

This means that I finally have a centralized place where I can store all the comments I post, and I can even check whether I got a response to a comment of mine.

No more links on the desktop, no more using up bandwidth on the blog owner’s mail server.

As a blog owner, you can add a javascript-snippet to your template so cocomment is always enabled for every commenter. Or you just keep your blog unmodified. In that case, your visitors will use a bookmarklet provided by cocomment which does the job.

Cocomment will crawl the page in question to learn if more comments were posted (or it will be notified automatically if the blog owner added that javascript snippet). Now, crawling sounds like a waste of the blog owner’s bandwidth. True. In a way. But on the other hand: it’s way better if one centralized service checks your blog once than if 100 different users each check it themselves. Isn’t it?

Anyways. The other thing that impresses me about cocomment is how much you can do with JavaScript these days.

You see, even if the blog owner does not add that snippet, you can still use the service by clicking on that bookmarklet. And once you do that, so many impressive things happen: In-Page popups, additional UI elements appear right below the comment field (how the hell do they do that? I’ll need to do some research on that), and so on.

The service itself currently seems a bit slow to me, but I guess that’s because they are getting a lot of hits at the moment. I just hope they can keep up, as the service they are providing is really, really useful. For me, and I imagine for others as well.

Blogroll is back – on steroids

I finally got around to adding an excerpt of the list of blogs I’m regularly reading to the navigation bar to the right.

The list is somewhat special as it’s auto-updating: it refreshes every 30 minutes and displays the blogs in descending order of last-updated time.

Adding the blogroll was a multi step process:

At first, I thought adding the Serendipity blogroll plugin and pointing it at my Newsgator subscription list (I’m using Newsgator to always have an up-to-date read status in both NetNewsWire and FeedDemon) would be enough, but unfortunately, that did not turn out to be the case.

First, the expat module of the PHP installation on this server has a bug making it unable to parse files with the Unicode byte order mark at the beginning (a short signature – three bytes in UTF-8 – that tells a parser how the document is encoded; in UTF-16 it indicates whether the document came from a little- or big-endian machine). So it was clear that I had to do some restructuring of the OPML feed (or patch around in the s9y plugin, or upgrade PHP).

Additionally, I wanted the list to be sorted in a way that the blogs with the most recent postings will be listed first.

My quickly hacked-together solution is this script, which uses an RSS/Atom parser I took from WordPress – which means that the script is licensed under the GNU GPL (as the parser is).

I’m calling it from a cron job every 30 minutes (that’s why the built-in cache is disabled in this configuration) to generate the OPML file sorted by the individual feeds’ update time stamps.
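The crontab entry is as simple as it gets (the path is a placeholder, and how the script’s output ends up in the OPML file depends on the implementation):

*/30 * * * * php /path/to/blogroll-sort.php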

That OPML file is then fed into the Serendipity plugin.

The only problem I now have is that the list unfairly favors the aggregated feeds, as these are updated much more often than individual people’s blogs. In the future I will thus either apply a penalty to these feeds, remove them from the list, or just plain show more feeds on the page.

Still, this was a fun hack to do and it fulfills its purpose. Think of it: whenever I add a feed in either NetNewsWire or FeedDemon, it will automatically pop up in the blogroll on gnegg.ch – this is really nice.

On a side note: I could have used the Newsgator API to get the needed information faster and probably even without parsing the individual feeds. Still, I went the OPML way as that’s an open format, making the script useful for other people – or for me, should I ever change the service.