Archive for the ‘unix’ Category

Helpful error messaging

Friday, July 19th, 2013

Trying to get all of our various flavors of Linux and Unix working with Kerberos/GSSAPI, and just loving the ever helpful errors it produces, stuff like


Unspecified GSS failure.  Minor code may provide more information

No error

Seriously? It seems likely to be some sort of reverse-DNS issue, but you’d never know it from the messages you actually get to see.
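Since the suspicion is reverse DNS, one quick sanity check is to resolve the host forward, resolve the resulting address back, and see whether the names agree. Below is a minimal sketch in Python. It assumes the standard `socket` resolver sees the same answers the Kerberos libraries do, which may not be true. The function name `check_forward_reverse` and the injectable resolver arguments are hypothetical, added only so the logic can be exercised without real DNS.

```python
import socket


def check_forward_reverse(hostname, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve the IP back, and compare names.

    Returns (ip, reverse_name, matches). The resolver functions are
    parameters only so the check can be run against fake lookups.
    """
    ip = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, ip_list)
    reverse_name, aliases, _ = reverse(ip)
    candidates = {reverse_name.lower().rstrip(".")}
    candidates.update(a.lower().rstrip(".") for a in aliases)
    matches = hostname.lower().rstrip(".") in candidates
    return ip, reverse_name, matches
```

A mismatch would be consistent with the reverse-DNS theory, though it would not prove it.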

Something new

Monday, June 24th, 2013

The other day at work, I saw something I’d never seen before.


I’ve been using ping for …a while…, and I’ve never seen a “DUP!” warning before


64 bytes from (x.x.x.x) : icmp_seq=0 …

64 bytes from (x.x.x.x) : icmp_seq=0 …  (DUP!)

64 bytes from (x.x.x.x) : icmp_seq=1 …


Presumably it indicates a routing quirk in our post-move networking setup. It’s not clear if it’s a sign-of-worse-quirk or an odd-but-inconsequential-quirk.
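ping prints “(DUP!)” when it receives more than one echo reply for the same sequence number. To get a feel for how often that happens, the flagged replies in a captured run can be tallied. This is a rough sketch in Python, assuming the output lines carry `icmp_seq=N` and the `(DUP!)` suffix as in the sample above. The name `count_duplicates` is made up for illustration.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def count_duplicates(ping_output):
    """Return {sequence_number: duplicate_count} for replies flagged DUP!."""
    dups = {}
    for line in ping_output.splitlines():
        match = SEQ_RE.search(line)
        if match and "(DUP!)" in line:
            seq = int(match.group(1))
            dups[seq] = dups.get(seq, 0) + 1
    return dups
```

If the duplicates show up on every sequence number rather than the odd one, that would point toward the worse kind of quirk.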

STAF mystery error of the day

Friday, March 15th, 2013

Have been having lots of fun going through all of our testing drones and fixing the STAF.cfg for the change in IP addresses. Much of the task is blocked because I have to wait for IT to cable and turn on the machines, so they’ve been coming online about 5-6 per day. In today’s batch, one of the machines gave an error I’ve never seen before, and that has Google completely stumped: a true zero-results query. Well, now it will have one result =p


“Could not open file /usr/local/staf/codepage/UTF16_BigEndian.bin

WARNING: Defaulting to LATIN_1”

The box is an RHEL Itanium (yes, I work with strange and rare beasties), and indeed the file doesn’t exist. Nothing even remotely named like that exists on any of my 30+ variations of Unix and Linux. There is an alias.txt file, which references files like UTF16_BigEndian and UTF16_LittleEndian, but in the directory, the only .bin files are named things like ibm-907.bin. The UTF16_BigEndian line in the alias file references an alias, ibm-1200, but while there are lots of sequential files before ibm-1164.bin, and plenty after ibm-1250.bin, there is no 1200, again, on any of my platforms.


So far, in my limited testing, STAF is working OK…but I suspect I need to find a test that would want to use UTF16 chars, to see anything.  Would make sense.  So probably not =p

Archivemount in linux

Wednesday, January 9th, 2013

As I was putting together yet another Linux machine, ubuntu flavored this time, cursing Unity 3D, I wondered…for all that linux has done to mimic the features of windows, why hadn’t they implemented compressible filesystem folders, like you’ve had available since Windows NT.  And it turns out they have something, archivemount.

It is not exactly the same as a compressed folder under windows, and is far from a transparent solution, but you can mount a zip file as a read/write directory, and add files to it, and it seems to work.

I had a leftover uncompressed copy of the original archive when I tried it the first time, but haven’t seen it happen again yet.  I suspect that all the effort I went to, creating my zip files with maximum compression, is wasted any time I write to an archive.  I suspect it just re-compresses at the default level.
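One way to test the re-compression suspicion would be to compare the archive’s compressed totals before and after writing through the mount. Here is a small sketch in Python using the standard `zipfile` module. The helper name `compression_summary` is hypothetical, and it only reports sizes. It makes no attempt to recover the compression level that was actually used.

```python
import zipfile


def compression_summary(archive):
    """Return (uncompressed_bytes, compressed_bytes) totals for a zip archive.

    archive may be a path or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        infos = zf.infolist()
        return (sum(i.file_size for i in infos),
                sum(i.compress_size for i in infos))
```

If the compressed total for the untouched members grows after a write, that would support the theory that the whole archive was rebuilt at a lower level.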

And the example in the man page for unmounting isn’t correct, at least not under ubuntu. Even though I did the mount operation as a regular user, I have to umount with sudo, or it fails with an error about the mount path not being in fstab, and me not being root anyway.

I will have to see how well it holds up under repeated use.

Physical access == pownership

Thursday, December 27th, 2012

One of the key lessons of computer security is that physical access is important.

We are preparing to move to a new office in the new year. A contractor has been tasked with finding the owners of all the miscellaneous boxes we have scattered about. He was getting stuck on one of the Sparc Solaris 8 machines, with none of the typical root passwords working. I showed him how, with a copy of the OS cdrom, it is trivial to reset the root password. The hardest part is identifying which disk slice to mount when you are in single-user mode, and that’s more of a tedious task of try-until-it-works than any sort of actual thought or research.

Accidental solution

Wednesday, September 19th, 2012

I’m trying to debug a problematic interaction, between our software, and SELinux on RHEL6. Under default SELinux=enforcing configurations, our server fails with

Error while loading shared libraries : /usr/lib/xxxx : cannot enable executable stack as shared object requires

This is a known issue with how one of our modules is built, and it isn’t scheduled to be addressed in the near future (the part that requires changing has a lengthy government certification process, and we want all changes to this area done at the same time to limit the number of times we have to certify). It’s been fixable in RHEL5 with a simple chcon -t textrel_shlib_t /usr/lib/xxxx.

But for some reason, while the same command gives no errors back, it also doesn’t prevent the problem that keeps us from running, under RHEL6.

One of the many suggestions for fixing or debugging the issue was to build a custom policy using audit2allow, and deal with it that way. Basically, you set your SELinux machine to permissive, do the offending operation, and then take the errors that it generated-but-ignored, build a policy with them, and then you can set your box back to enforcing, add the policy, and bobsuruncle. So I bring up /etc/selinux/config in my handy editor, but because I’m distracted by other things at work, I don’t notice there are two configurable values in the file, and instead of changing SELINUX, I changed SELINUXTYPE. Which is where things get odd.

According to the docs, the only valid values for SELINUXTYPE are targeted and mls, but I set it to permissive. I didn’t notice this, do my product install, and everything works, as expected. I go to set the config file back to default values, at which point I notice my error. Hrm, I think to myself. Looking in /var/log/audit/audit.log, there aren’t any errors for audit2allow to work off of. I put the config back to default, reboot the box, and miraculously, things are still working.
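The SELINUX/SELINUXTYPE mix-up is easy to make and easy to miss, so a tiny checker for the config file might have caught it. This is a sketch in Python, assuming simple `KEY=value` lines. The allowed values for SELINUXTYPE are just the two quoted from the docs above, and the ones for SELINUX are the usual enforcing, permissive and disabled. The name `validate_selinux_config` is hypothetical.

```python
VALID = {
    "SELINUX": {"enforcing", "permissive", "disabled"},
    # Values for SELINUXTYPE as quoted from the docs in the post above.
    "SELINUXTYPE": {"targeted", "mls"},
}


def validate_selinux_config(text):
    """Return a list of problems found in /etc/selinux/config-style text."""
    problems = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        allowed = VALID.get(key)
        if allowed is not None and value not in allowed:
            problems.append(f"{key}={value} is not one of {sorted(allowed)}")
    return problems
```

Run against the accidental edit, it would report SELINUXTYPE=permissive as invalid. It says nothing about why that edit made the original problem go away.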

It’s hard to feel like I’ve really fixed the problem, but it sure doesn’t seem to still be occurring, so….

PAM hate

Monday, September 17th, 2012

One of my least favorite parts of the current job is PAM (Pluggable Authentication Modules). I can never find sufficient documentation to explain why things work, or more often don’t work, the way they do, versus what I expect. Every flavor of UNIX implements its PAM with quirks.

Today, one of my cow-orkers asked for some help trying to determine if the unexpected PAM results he was seeing were a problem with our product, or a problem with the configuration of the environment. In the end, we were able to determine that our product was correctly calling the PAM module, but couldn’t explain why the module wasn’t returning expected values for our test accounts.

As he was leaving my office, the other guy expressed his understanding why so many of the help pages we found started out with some variation of, “I hate PAM, but have to…”

UTF-8 and od

Wednesday, June 13th, 2012

One of the many challenges of my current job is trying to create tests that will function the same, across 30+ flavors of UNIX-y-type systems. There are lots of seemingly simple things that just don’t work the same way for AIX, HPUX, Solaris, and the various Linuxi we support. Sometimes even on the same OS family, changing processor architecture leads to unexpected issues. Eventually I figure most of them out, or at least a way around the differences that are irrelevant to my testing.

Today’s challenge, validating UTF-8 support for filenames. Not entirely unexpected, piping UTF-8 strings through STAF via Java, into local system grep utilities isn’t providing consistent results. Entirely unexpected, the octal-dump program, ‘od’, is returning different results for the same source file, depending on the host machine, completely independent of my STAF issues.
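One possible explanation, and this is only a guess since the differences were never pinned down here: plain `od` with no options dumps the file as two-byte octal words, so its output depends on the host’s byte order, while byte-wise formats such as `od -c` or `od -t x1` do not. The Python sketch below shows both views of the same bytes. The function names are hypothetical.

```python
import struct


def octal_bytes(data):
    """Byte-by-byte octal, independent of host byte order."""
    return [f"{b:03o}" for b in data]


def octal_words(data, byteorder):
    """Two-byte octal words, as a host of the given byte order would see them.

    byteorder is "little" or "big"; odd-length input is zero-padded.
    """
    if len(data) % 2:
        data += b"\x00"
    fmt = ("<" if byteorder == "little" else ">") + "H" * (len(data) // 2)
    return [f"{w:06o}" for w in struct.unpack(fmt, data)]
```

For the two UTF-8 bytes of “é” (0xC3 0xA9), the byte view is the same everywhere, while the word view flips between little-endian and big-endian hosts. If that is what is going on, comparing byte-wise dumps across the test machines should make the differences disappear.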

STAF and impersonated identities

Wednesday, May 30th, 2012

Recently, one of my AIX test drones had to be rebuilt from the ground up. After IT had reinstalled the OS and tools, they turned the box back over to me for the finishing tasks necessary to use it as part of our automated test system. I did my part, and started a sample test run. Failures everywhere, and slow, oh so slower than things should be.

Debug, debug, debug, oh how curious, I’m getting permissions errors when I try to execute a remote command using ‘sudo’ via STAF.

Run ‘whoami’ as STAF remote command, return ‘root’. ‘visudo’ shows root and wheel have permissions to execute anything without a password.

But I’m seeing a password prompt written to /dev/console. hrm.

Oh look, in our code we tell STAF to impersonate another user. A user who is in the wheel group, so that still shouldn’t be a problem. But for some reason it is. Looking at the impersonated user,

$ groups
staff wheel

OK, visudo and make a copy of the wheel authorization line, but for the staff group.

Now things work OK.

Doesn’t seem right, but there it is. Something about running STAF as root, impersonating a user, who then calls sudo, trips up if the key-group isn’t the first group listed.

What else will be off?

How useful of you (AIX)

Wednesday, April 4th, 2012

Turns out you can resize (at least if you are only growing) partitions on a live system, under AIX 7.1, even stuff like / and /var. Color me surprised.