Nothing stupid about them...

...I just stole the name from David Letterman's "stupid pet tricks". I hope these tips help you avoid or fix mistakes along the way in your *nix administration duties.

Tuesday, December 20, 2011

A log file is filling up my file system!

Application log files that grow uncontrollably can cause problems, like filling up whatever file system they're located on.

The best way to handle this is to stop the application so you can remove the offending logs, or compress them, then restart the application. But in a production environment you may not be able to stop the application whenever you want.

So, desperate times call for desperate measures. Whack that log file without affecting the application's ability to write to it.


cat /dev/null > /var/app/logfile


...and the app will keep on ticking while you breathe a sigh of relief and set up logrotate!
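
Here is a minimal sketch of why truncating in place works where deleting the file would not. A background loop stands in for the application, and the temp file stands in for the log (both are made up for the demo). It assumes the writer opened the log in append mode, as most loggers do.

```shell
#!/bin/sh
# Hypothetical stand-in for the application: a background loop that
# holds the log open in append mode and writes one line per second.
log=$(mktemp)
( exec >>"$log"
  i=0
  while [ "$i" -lt 30 ]; do echo "line $i"; i=$((i + 1)); sleep 1; done ) &
writer=$!

sleep 2
before=$(($(wc -c < "$log")))

# Truncate in place: the file keeps its name and inode, so the
# writer's open descriptor stays valid and the space is freed.
: > "$log"
after_truncate=$(($(wc -c < "$log")))

sleep 2
after_more_writes=$(($(wc -c < "$log")))

kill "$writer" 2>/dev/null
wait "$writer" 2>/dev/null || true
rm -f "$log"
echo "before=$before truncated=$after_truncate later=$after_more_writes"
```

The `: > file` form does the same job as `cat /dev/null > file`. Had the log been removed with rm instead, the writer would keep filling the now-invisible file and the space would not come back until the application closed it.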

Sunday, October 9, 2011

Copy 'n' Paste: DON'T

Just happened again today: instead of copying a new config file for some app, you copy 'n' paste from the terminal window on one side of your screen, to your vi or emacs session in another terminal window.

So you end up with something that looks okay, but if there are any long lines in the file, they will be broken up after the paste.

Don't believe me? Copy 'n' paste something like a grub.conf or another file with a long command line from one window to another. That long line will be chopped up by hard line breaks wherever the terminal wrapped it. In the case of grub.conf your system will probably not even boot.

Copy:

kernel /vmlinuz-2.6.32-131.0.15.el6.i686 ro root=/dev/mapper/vg_haydn-lv_root rd_LVM_LV=vg_haydn/lv_root rd_LVM_LV=vg_haydn/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us crashkernel=auto rhgb quiet


Paste:

kernel /vmlinuz-2.6.32-131.0.15.el6.i686 ro root=/dev/mapper/vg_hay
dn-lv_root rd_LVM_LV=vg_haydn/lv_root rd_LVM_LV=vg_haydn/lv_swap rd_NO_LUKS
rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=p
c KEYTABLE=us crashkernel=auto rhgb quiet




How do you find such offending files when you are troubleshooting? Open the file in vi and look for telltale spaces at the beginning or end of lines that should be continuous. When you move the cursor up and down through the file, it should JUMP over a long line that is merely wrapped on screen; if it stops on each piece, the line has been broken.

Don't be lazy. Use the right commands to make and edit config files.
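
A rough sketch of the difference, assuming a temp directory stands in for the two machines and using made-up file names. The paste is simulated with fold(1), which breaks lines at a fixed width the way a terminal does, and cmp(1) tells you whether the result still matches the original.

```shell
#!/bin/sh
# Hypothetical file names; a temp directory stands in for both hosts.
work=$(mktemp -d)
printf 'kernel /vmlinuz ro root=/dev/mapper/vg_haydn-lv_root rhgb quiet\n' \
    > "$work/grub.conf.orig"

# The right way: copy the file itself (cp locally, scp between hosts).
cp "$work/grub.conf.orig" "$work/grub.conf.copied"

# Simulate a paste from a terminal that wrapped at 40 columns.
fold -w 40 "$work/grub.conf.orig" > "$work/grub.conf.pasted"

# cmp -s is silent and only sets the exit status.
check() {
    if cmp -s "$1" "$2"; then echo identical; else echo differs; fi
}
copied_result=$(check "$work/grub.conf.orig" "$work/grub.conf.copied")
pasted_result=$(check "$work/grub.conf.orig" "$work/grub.conf.pasted")
echo "copied: $copied_result, pasted: $pasted_result"
rm -rf "$work"
```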

Sunday, July 24, 2011

Linux boot hanging

Some versions of Linux have a strange issue at boot, and a quick Google search for "boot hanging udev" will show you that there are many possible causes.

Let me add one more reason to the pile.

On some versions of Red Hat Enterprise Linux 4, if your system is configured to authenticate users to an LDAP directory of some sort, your issue could be that the system is trying to connect to LDAP before seeking its own hardware! By the time of RHEL 4 update 8 this issue was fixed.

The solution is to boot the system off the DVD into rescue mode (linux rescue), let it mount the installation on /mnt/sysimage, then edit /mnt/sysimage/etc/nsswitch.conf and remove all references to LDAP.

On the next reboot you should blow right by udev and watch your system start up.
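
If you would rather not hand-edit, the same change can be sketched with sed. This runs against a made-up sample file in a temp directory; in rescue mode the real file would be /mnt/sysimage/etc/nsswitch.conf, and you would want to keep a backup copy first. The pattern is deliberately blunt and assumes "ldap" appears only as a source name.

```shell
#!/bin/sh
# Hypothetical sample nsswitch.conf; the real one lives under /mnt/sysimage/etc.
work=$(mktemp -d)
cat > "$work/nsswitch.conf" <<'EOF'
passwd:     files ldap
shadow:     files ldap
group:      files ldap
hosts:      files dns
EOF

# Drop every "ldap" source together with the whitespace in front of it.
# Write to a new file so the original is still there for comparison.
sed 's/[[:space:]]*ldap//g' "$work/nsswitch.conf" > "$work/nsswitch.conf.new"

passwd_line=$(grep '^passwd:' "$work/nsswitch.conf.new")
remaining=$(grep -c ldap "$work/nsswitch.conf.new" || true)
echo "passwd line is now: $passwd_line (ldap lines left: $remaining)"
rm -rf "$work"
```

Once the system is up again and the network is reachable, you can put the LDAP entries back.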

Thursday, June 17, 2010

Deleting NULL characters

If you ever need to process files from a mainframe, it's likely you will come across a file with lots of NULL characters (ASCII 0).

These characters appear in vi or other editors like this:

^@^@^@^@^@^@^@^@^@^@^@

Ugh.

You can use your old friend sed(1) to get rid of them, but you have to match the pattern of the NULL character. And how do you pass that to sed?

sed -e 's/\o00//g' infile > outfile

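That pattern says match the character with octal value 00 (the NUL byte) and replace it with nothing. The \o escape is a GNU sed extension, so other sed implementations may not accept it.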
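
Where GNU sed is not available, tr(1) can do the same job. A small sketch, using a made-up sample record in a temp directory:

```shell
#!/bin/sh
# Hypothetical sample data: a record padded with NUL bytes.
work=$(mktemp -d)
printf 'ACCT\000\000\0001001\000\000\n' > "$work/infile"

# tr -d deletes every occurrence of the listed character; \000 is NUL.
tr -d '\000' < "$work/infile" > "$work/outfile"

in_bytes=$(($(wc -c < "$work/infile")))
out_bytes=$(($(wc -c < "$work/outfile")))
cleaned=$(cat "$work/outfile")
echo "$in_bytes bytes in, $out_bytes bytes out: $cleaned"
rm -rf "$work"
```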

Friday, March 26, 2010

TCP keepalive vs $TMOUT

A TCP connection that times out is annoying at best. At worst, it kills applications that depend on the connection staying up while idle.

If your shell disconnects after a while, re-log in then enter


$ env | grep TMOUT


If TMOUT is set to a non-zero value, the shell will exit after that many seconds of inactivity. If this is an issue, increase TMOUT or set it to zero. (env only lists exported variables; if it prints nothing, check with echo $TMOUT as well.)

But sometimes you don't have this control over timeouts...idle TCP connections may be configured to drop at the router or firewall level. If you don't have control over those devices, it may be necessary to configure TCP keepalive.

This syntax is for Linux; your OS will be different. Add these lines to /etc/sysctl.conf:

net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20

then type in

sysctl -p

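to get the new values to take effect. The 600 means that keepalive will kick in when the connection has been idle for 10 minutes (600 seconds). After that, a probe is sent every 60 seconds, and the connection is dropped if 20 probes in a row go unanswered. These settings only affect sockets on which the application has enabled keepalive (SO_KEEPALIVE).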
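
A quick back-of-the-envelope check of what those three numbers add up to, as a sketch. The variable names are mine; the values are the ones from the sysctl.conf lines above.

```shell
#!/bin/sh
# Values from the sysctl.conf example (variable names are just for this sketch).
keepalive_time=600    # idle seconds before the first probe
keepalive_intvl=60    # seconds between probes
keepalive_probes=20   # unanswered probes before the connection is dropped

# Worst case before a dead peer is noticed: the idle period plus
# every probe interval running out with no answer.
worst_case=$((keepalive_time + keepalive_intvl * keepalive_probes))
echo "dead connection detected after about $worst_case seconds ($((worst_case / 60)) minutes)"
```

The number that matters for a firewall is keepalive_time: it has to be shorter than the firewall's idle timeout, or the connection will already be gone before the first probe goes out.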

Saturday, April 18, 2009

AIX and its "stanza" structure

Most *nix systems keep their system data in text files of delimited rows. Pretty simple, eh? Grep for the record you are searching for, then awk out the field you want.

root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

Not so AIX. This OS favors the "stanza" structure, in which data is presented in this form:

root:
        password = XXXXXXXXXXXXX
        lastupdate = 1239276751
        flags =

joeuser:
        password = XXXXXXXXXXXXX
        lastupdate = 1239044995
        flags =

Ah well, to each their own, right? So how do you quickly parse data out of an AIX file? I use this method...probably not the best, but it works.

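sed -n '/^joeuser:/,/:/p' /etc/security/passwd

This will return the stanza for joeuser up to and including the next user name below it (the next line containing a colon), which you can then grep for whatever field you want. Anchoring the pattern as ^joeuser: keeps it from matching other lines that merely contain the string joeuser.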
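
If you want the field in one step, awk can track which stanza it is in. This is only a sketch: stanza_value is a helper name I made up, the sample file is a copy of the example above in a temp directory, and it assumes stanza headers start in column one and end with a colon.

```shell
#!/bin/sh
# Hypothetical sample file in the stanza layout shown above.
work=$(mktemp -d)
cat > "$work/passwd" <<'EOF'
root:
        password = XXXXXXXXXXXXX
        lastupdate = 1239276751
        flags =

joeuser:
        password = XXXXXXXXXXXXX
        lastupdate = 1239044995
        flags =
EOF

# stanza_value FILE STANZA ATTRIBUTE (hypothetical helper)
# Prints the value of ATTRIBUTE inside the named stanza.
stanza_value() {
    awk -v stanza="$2" -v attr="$3" '
        # A header line: remember whether it is the stanza we want.
        /^[^ \t].*:$/ { inside = ($0 == (stanza ":")); next }
        # Attribute lines look like "name = value".
        inside && $1 == attr { print $3 }
    ' "$1"
}

lastupdate=$(stanza_value "$work/passwd" joeuser lastupdate)
echo "joeuser lastupdate: $lastupdate"
rm -rf "$work"
```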

Friday, March 6, 2009

"I'm root, and you're not!"

We once came across a very unusual problem: root could log in to this system, get a shell, and do anything that root normally does. Non-root users, however, couldn't log in, and those who were already logged in could not run any commands, not even basics like ls or cd. Only shell built-ins such as echo worked, and only if you were already logged in.

Our support contact finally told us to look at permissions for the root of the file system:

root@mahler:/# ls -ld /
drwx------ 21 root root 4096 2009-01-22 19:27 /
root@mahler:/#

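A big "WTF!" Clearly, someone fat-fingered a chmod. With mode 700 on /, nobody but root can traverse the root directory, so every absolute path lookup fails for everyone else.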

By now you have guessed the solution:

# chmod 755 /
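
A small sketch of a check you could drop into a health script. others_can_traverse is a name I made up; it only reads the mode string from ls -ld, and the demo runs against two throwaway directories rather than the real /.

```shell
#!/bin/sh
# others_can_traverse DIR (hypothetical helper)
# Succeeds if the "other" execute (search) bit is set on DIR.
others_can_traverse() {
    # First ten characters of ls -ld are the type and permission string;
    # the tenth is the execute bit for "other" (x, or t with the sticky bit).
    mode=$(ls -ld "$1" | cut -c1-10)
    case "$mode" in
        d????????[xt]) return 0 ;;
        *)             return 1 ;;
    esac
}

# Demo on throwaway directories standing in for a broken and a healthy /.
work=$(mktemp -d)
mkdir "$work/locked" "$work/open"
chmod 700 "$work/locked"
chmod 755 "$work/open"

if others_can_traverse "$work/locked"; then locked_result=ok; else locked_result=blocked; fi
if others_can_traverse "$work/open"; then open_result=ok; else open_result=blocked; fi
echo "mode 700: $locked_result, mode 755: $open_result"
rm -rf "$work"
```

On a live system the call would simply be others_can_traverse / with an alert if it fails.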

Monday, December 15, 2008

Repetitive tasks 101

It happens all the time. Recently I was helping with some troubleshooting, and the request was for me to issue a telnet command from our UNIX system so that the network people could see the packets and figure out where they were being dropped.

After about the 20th time, I was getting tired of hitting [up-arrow, enter]. So I entered a script at the command line of this system:


# while :
> do
> telnet 192.168.1.50 &
> sleep 10
> kill $!    # kill only the telnet we just started, not every telnet on the box
> done


I advised them that the system would repeat the telnet request every ten seconds, kill it, and launch a new one. Repeatedly. For as long as it took.

Then I went and got coffee.

Wednesday, October 22, 2008

Whew...very strange ARP problem on Solaris 10

File this one under "Current Events"!

If you are running Solaris 10 update 4 or higher, you may notice Ethernet addresses in the Address Resolution Protocol (ARP) cache that do not match the actual addresses. Here's a good discussion of what's happening.

http://forums.sun.com/thread.jspa?threadID=5327921&start=0

The problem is that dual-homed Windows boxes with Broadcom NICs on the same subnet are hosing the ARP table on your Solaris system. Sun says its drivers are not the problem and that they adhere closely to RFC 826. Regardless, it is a headache for system administrators.

The correct solution is to patch the Windows boxes. But there may be other workarounds, too. Check the comments section of this post in the weeks ahead.

Sunday, September 21, 2008

Kickstarting Linux installs

If you're not familiar with kickstart, it's a way of supplying answers to some or all of the questions the installer asks, which greatly speeds up the install process.

We used to do kickstart installs with floppy drives. It's simple enough: write a ks.cfg file to an ext2- or ext3-formatted floppy disk, boot off your install CD, and enter at the boot: prompt:

boot: linux ks=floppy


Lo and behold, servers started showing up without floppy drives, negating the kickstart advantage.

That is, until we finally found out how to use a USB thumb drive to accomplish the same thing. Drop the ks.cfg in the root directory of the thumb drive and enter:

boot: linux ks=hd:sda1:/ks.cfg


That's assuming sda1 is your USB device. It could be something different. For instance, if you have two internal (logical) RAID devices, the device will probably be sdc1, and so forth.

Friday, August 29, 2008

The system is running out of swap space, and I have to add more NOW!

Here's how to add swap on the fly. Find a file system with enough free space to hold a swap file, say 700 MB for this example, then cd to a directory on it.

Let's start with Solaris which is just a couple of commands:

# mkfile 700m ./swapfile
# swap -a ./swapfile
# swap -l

Instant relief! Now let's tackle Linux which is just a little more involved.


# dd if=/dev/zero of=swapfile bs=1024 count=655350
655350+0 records in
655350+0 records out
# mkswap ./swapfile
Setting up swapspace version 1, size = 671072 kB
# swapon ./swapfile
# free

It's important to note that this additional space will not be activated as swap on the next reboot unless you add it to the system's startup configuration (for example /etc/fstab on Linux or /etc/vfstab on Solaris). But these commands WILL get you through a period of heavy virtual memory usage.
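
Two details worth sketching for the Linux case. First, the dd count: with bs=1024, the 655350 blocks shown above come to roughly 640 MB, so a full 700 MB needs a larger count. Second, the /etc/fstab line that would bring the file back after a reboot. The path /bigdisk/swapfile is made up for illustration, and nothing here touches the real system; it only prints what you would run or add.

```shell
#!/bin/sh
# Desired swap size and dd block size (bs=1024 as in the example above).
size_mb=700
block_size=1024

# Number of 1024-byte blocks needed for size_mb megabytes.
count=$((size_mb * 1024 * 1024 / block_size))
echo "dd if=/dev/zero of=swapfile bs=$block_size count=$count"

# Hypothetical location of the swap file; use the real absolute path.
swapfile=/bigdisk/swapfile

# Line to append to /etc/fstab so the file is activated at boot.
fstab_line="$swapfile swap swap defaults 0 0"
echo "$fstab_line"
```

It is also a good idea to chmod 600 the swap file so that only root can read it.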

Tuesday, August 19, 2008

I have to grab every file from a web site!

I hope you're not going to sit there at your browser and click away till you get every file. Find yourself a *nix system with the widely-available wget command installed, change to a directory with a lot of space, and make it easy on yourself!

$ wget --mirror http://www.mysite.com


The --mirror option will follow all links to other files on this site.

You may be in a situation where you need to grab all the numerically-named files in one directory, say from file1.html to file3000.html, even if they are not linked.

$ /bin/bash
$ for i in $(seq 1 3000)
> do
> wget http://www.mysite.com/numberfiles/file$i.html
> done
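
A variation on the same idea, as a sketch: build the list of URLs first, look it over, then hand the whole list to wget. The list goes into a temp file, and the wget call is only printed here so the sketch does not hit the network; --input-file and --wait are standard wget options.

```shell
#!/bin/sh
# Build the list of URLs from file1.html to file3000.html.
urls=$(mktemp)
i=1
while [ "$i" -le 3000 ]; do
    echo "http://www.mysite.com/numberfiles/file$i.html"
    i=$((i + 1))
done > "$urls"

url_count=$(($(wc -l < "$urls")))
first_url=$(head -n 1 "$urls")
last_url=$(tail -n 1 "$urls")
echo "$url_count URLs, from $first_url to $last_url"

# To fetch them, pausing one second between requests to be polite:
echo "wget --wait=1 --input-file=$urls"
```

This form also works in a plain Bourne shell without seq, and a failed run can be resumed by trimming the list.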
