Some General SysAdmin Principles
Documentation
- Document your work.
- Keep a journal.
- Use at least a plain text file.
- Date each entry.
- If you have a boss that you send your journal to, spellcheck your journal.
- The journal isn't just for your boss, it's for you.
- It's your own FAQ.
- It's your own collection of HOWTOs to help you and others reproduce work in the future on other systems.
- When you make an entry in your journal, ask yourself, “Is this enough information to allow me and maybe others to reproduce this work that I just spent hours on?”
- It can get huge over time, which is why there may be a need for a better system of self-documentation than a flat text file.
- Possibly something blog-like where you can add tags to each journal entry.
- The journal has to be backed up and accessible from anywhere, at your fingertips.
- This is why some admins keep logs in notebooks (paper notebooks), though this has several drawbacks.
- If your journal is on a company server or your office workstation, and the network is down, and you're at home 30 miles away, and you need some vital information that's in the journal…now what?
- Use comments in the configuration files under /etc to document your changes.
- Sign and date your comments for your future reference and for other admins.
- In general, be well-organized.
- whether that's the directory structure under your home directory or your written journal
- Use technology in documentation.
- Set up a web page or FAQ.
- Use video or digital photography.
- Sometimes, a picture is literally worth a thousand words.
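The journal advice above (plain text, dated entries, optional tags) can be sketched in a few lines of Python. This is only an illustration: the function names, the `== date (author) [tags]` header layout, and the file name are assumptions made for the example, not a format the notes prescribe.

```python
from datetime import date
from pathlib import Path


def add_entry(journal, text, tags=(), author="", today=None):
    """Append one dated (optionally signed and tagged) entry to a plain text journal."""
    day = (today or date.today()).isoformat()
    header = f"== {day}"
    if author:
        header += f" ({author})"
    if tags:
        header += " [" + ", ".join(sorted(tags)) + "]"
    entry = f"{header}\n{text.strip()}\n\n"
    # Append mode keeps all earlier entries intact.
    with Path(journal).open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry


def headers_with_tag(journal, tag):
    """Return the header lines of all entries that carry the given tag."""
    hits = []
    for line in Path(journal).read_text(encoding="utf-8").splitlines():
        if line.startswith("== ") and line.endswith("]") and "[" in line:
            tags = line[line.rindex("[") + 1:-1].split(", ")
            if tag in tags:
                hits.append(line)
    return hits
```

A call such as `add_entry("journal.txt", "Rebuilt the RAID array; steps: ...", tags=["raid"], author="jdoe")` appends one entry, and `headers_with_tag("journal.txt", "raid")` lists the matching entry headers. Keeping the file in a backed-up, remotely reachable location is still up to you.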
The hard stuff (security and backups)
- Take care of the hard stuff first.
- The worst feelings for a SysAdmin (makes you fear getting fired):
- The system got hacked.
- Something important got deleted, and there's no backup.
- Backups, security hardening, disaster recovery plan
- How good is the backup?
- Do you have a system and plan for doing bare-metal recovery?
- Check if you can actually restore from the backup.
- Run a file system integrity (FSI) checker.
- Are your FSI database and configuration file on a read-only medium?
- Run the checker from a read-only medium or NFS mount.
- Run a root kit scanner.
- Run the scanner from a read-only medium or NFS mount, too.
- Be aware of recently discovered security problems and exploits and incidents.
- Do you subscribe to newsgroups and mailing lists that might give you this information?
- Run logwatch (https://sourceforge.net/projects/logwatch/) or another program that alerts you to irregularities in your system logs.
- Run an intrusion detection system.
- Use a firewall.
- iptables on Linux
- There are useful front-ends and canned firewall rules that you can use.
- Whitelist incoming connections using tcpd (TCP Wrappers).
- Whitelist users who are allowed to ssh in.
- Is it obvious what OS you are running, what servers you are running?
- Make it less obvious.
- telnet banners, /etc/issue*, info obtained from telnet to port 25 (SMTP)
- Beware the “It won't happen here” or “Our users aren't that smart” mentality that leads to security problems.
- Think in terms of redundancy.
- Is there a backup of the backup?
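To make the idea of a file system integrity check concrete, here is a minimal sketch: record a SHA-256 digest for every file under a directory, then compare a later snapshot against that baseline. The function names are made up for this example, and a real FSI checker also tracks permissions, ownership, and timestamps, so treat this as an illustration of the principle only. The same comparison can be used to confirm that a test restore matches the data that was backed up.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each regular file under root (as a relative path) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    }


def compare(baseline, current):
    """Return (added, removed, changed) lists of relative paths."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changed = sorted(
        name for name in set(baseline) & set(current)
        if baseline[name] != current[name]
    )
    return added, removed, changed
```

As the notes point out, the baseline is only trustworthy if an intruder cannot rewrite it, which is why it belongs on a read-only medium.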
Efficiency Matters
- Proxies can reduce bandwidth usage for over-the-internet installs and general internet usage.
- Avoid mass manual installations.
- Try to use cloning as much as possible.
Remote Access / Administration
- Unless every machine you administer is only on from 9am-5pm, system administration is not a 9-to-5 kind of job.
- The ability to work remotely is essential and something that SysAdmins should demand and set up.
- Enable Wake-on-LAN and/or timed power-on in the BIOS of each system.
- If you have input into the systems that are purchased, stress that these systems come with these types of features.
- Learn to use serial consoles, remote KVM switches.
- Enable more than one way to access a system remotely.
- “# ifdown eth0” will kill network access to your system.
- Have you enabled another way to access your system when that happens?
- Learn to use the command line.
- Yes, even on Windows.
- Remote administration is best done at the command line.
- Learn SSH port forwarding.
- One of the few services/protocols that is trusted enough to allow through firewalls.
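As a small illustration of SSH local port forwarding, the sketch below builds the command line for `ssh -N -L local_port:target_host:target_port gateway`, which makes a service behind the gateway reachable on a local port. The helper name and the host names are hypothetical; only the `-N` (no remote command) and `-L` (local forward) options come from OpenSSH itself.

```python
import shlex


def local_forward_command(gateway, local_port, target_host, target_port, user=None):
    """Build the argv for: ssh -N -L local_port:target_host:target_port [user@]gateway"""
    for port in (local_port, target_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    destination = f"{user}@{gateway}" if user else gateway
    forward = f"{local_port}:{target_host}:{target_port}"
    return ["ssh", "-N", "-L", forward, destination]


# Hypothetical hosts: reach an intranet web server through an SSH gateway.
argv = local_forward_command(
    "gateway.example.org", 8080, "intranet.example.org", 80, user="admin"
)
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv)` would start the tunnel; while it runs, connections to port 8080 on the local machine are carried over SSH to port 80 on the intranet host.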
Startup / Shutdown
- Do some checks before shutting down / restarting a system.
- Is anyone logged on?
- How long can the system afford to be shut down?
- Is there a backup? Is a backup needed?
- Does the system really require a reboot?
- Is the boot loader properly configured?
- Is there an alternate way to access the system in case the bootup fails?
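The first check in the list above, "Is anyone logged on?", is easy to automate. The sketch below parses the output of the standard `who` command and reports logins other than your own. The function names are invented for this example, and it assumes the login name is the first whitespace-separated field of each `who` line.

```python
import subprocess


def logged_in_users(who_output):
    """Return the sorted, de-duplicated login names found in `who` output."""
    return sorted(
        {line.split()[0] for line in who_output.splitlines() if line.strip()}
    )


def others_logged_in(who_output, me):
    """Return every logged-in user except the admin running the check."""
    return [user for user in logged_in_users(who_output) if user != me]


def current_who_output():
    """Run `who` and return its text output (assumes `who` is installed)."""
    result = subprocess.run(["who"], capture_output=True, text=True, check=True)
    return result.stdout
```

Before a reboot, something like `others_logged_in(current_who_output(), "root")` tells you whom to warn first. The remaining questions in the list still need human judgment.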
Automation and Scripting
- Learn one scripting language well and use it consistently.
- Using it consistently helps you to learn it well.
- But not a language so obscure that only you use it and understand it.
- Unless you really like typing, develop a set of aliases and short scripts that reduce typing.
- Learn to use the command line (shell) history.
- Increase the size of the command history.
- Periodic processing (cron, at, etc.)
- Processes must be non-interactive to be scheduled…
- or learn expect or an equivalent system.
- Test cron scripts as you would any other script you write.
- Write scripts so that they are scalable.
- Will still be useful as more systems are added to your admin stable.
- Applies to any administration solution: Think in terms of scalability.
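A script meant for cron should take everything it needs from its arguments, never prompt, stay quiet when all is well, and signal problems through its exit status. Here is a minimal sketch of that shape, using a disk-usage check as the job. The script name `diskcheck.py`, the function names, and the exit codes are choices made for this example.

```python
import shutil
import sys


def percent_used(path):
    """Return how full the file system holding path is, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # some pseudo file systems report a size of zero
        return 0.0
    return 100.0 * usage.used / usage.total


def check(paths, limit_percent):
    """Return one warning line per path that is fuller than limit_percent."""
    warnings = []
    for path in paths:
        pct = percent_used(path)
        if pct > limit_percent:
            warnings.append(f"{path}: {pct:.1f}% used (limit {limit_percent}%)")
    return warnings


def main(argv):
    """Non-interactive entry point: 0 = fine, 1 = warnings, 2 = usage error."""
    if len(argv) < 3:
        print("usage: diskcheck.py LIMIT_PERCENT PATH [PATH ...]", file=sys.stderr)
        return 2
    warnings = check(argv[2:], float(argv[1]))
    for line in warnings:
        print(line)
    return 1 if warnings else 0


# As the last line of a real script: sys.exit(main(sys.argv))
```

Because the paths are arguments rather than hard-coded values, the same script keeps working as more file systems or machines are added, which is the scalability point made above.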
Software Installation/Management, System Maintenance
- Workflow for software installation:
- Look for official (e.g., Debian, Microsoft) source for software that you want to install.
- In the case of Debian, software that can be installed using “apt-get, aptitude, …” without adding third party sources/mirrors.
- If official sources are not available, use a recommended unofficial source or mirror.
- Be sure to get the source's or mirror's “keys” to ensure that the software that you get is “signed.”
- Installed packages should still be upgradeable/maintainable using official package management commands (apt-get, aptitude, yum, etc.)
- If no recommended unofficial source or mirror exists, then get the source code, if available, and attempt to build and install it yourself.
- Here, you can either build your own installation packages or install using package “sandboxing” tools like stow/xstow.
- Multiple package management schemes on the same system
- For example, the Apple, Fink, and MacPorts situation on Mac OS X
- Requires fiddling with $PATH and library search paths (Linux: /etc/ld.so.conf*)
- Just because a package is “official” does not make it up-to-date (in terms of currency and bug/security fixes).
- These are maintained by people with real lives, and the people are often volunteers, doing this on their own time.
- For timely fixes, you might have to manually compile/install/maintain regularly instead of relying on the “official” package management system.
- Automatic updates
- Sounds nice, but an admin should know what's actually being installed or upgraded.
- Software installs/updates should be an interactive activity.
- But downloading (but not installing) upgrades overnight via scheduling should be fine.
- Upgrade the OS/kernel/software carefully.
- (or apply security patches, service packs, etc.)
- Does it have to be done now?
- Is the system being used? Are users logged on?
- Can it wait until the end of the quarter/semester?
- Have you backed up the system first?
- Will you be able to back out to a previous system state if your upgrade is disastrous?
- Have you tested the upgrade adequately on one or more test systems?
- Have you tested the upgrade after you have applied it?
- You can apply only the high-priority security patches or packages instead of the whole bundle of upgrades/patches.
- Install only software that is actually used.
- Unnecessary software may contain vulnerabilities.
- But depriving users of software they really need may motivate them to install it themselves, in their home directories.
- Not all software comes nicely packaged.
- such as commercial or “non-free” software
- Sun's Java, icc (Intel compiler), eclipse IDE, netbeans IDE, vmware
- May require compiling it yourself
- and installing in /usr/local or /opt
- Before installing, think about the possibility of uninstalling.
- Compiling and installing to /usr/local scatters files across /usr/local/bin, /usr/local/lib, /usr/local/share, /usr/local/etc, and /usr/local/sbin.
- Does the software's Makefile have a working uninstall target?
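When a Makefile has no working uninstall target, one fallback is to record which files an install added, so they can be removed by hand later. The sketch below lists the files under a prefix before and after the install and writes the difference to a manifest. The function names and the manifest layout are assumptions for this example, and the approach does not notice files that the install overwrote rather than created.

```python
from pathlib import Path


def list_files(prefix):
    """Return the set of files and symlinks under prefix, as relative paths."""
    prefix = Path(prefix)
    return {
        str(p.relative_to(prefix))
        for p in prefix.rglob("*")
        if p.is_file() or p.is_symlink()
    }


def new_files(before, after):
    """Return the sorted paths present after the install but not before it."""
    return sorted(after - before)


def write_manifest(manifest, files):
    """Write one relative path per line so the install can be undone later."""
    Path(manifest).write_text("".join(f"{name}\n" for name in files), encoding="utf-8")
```

Typical use: call `list_files("/usr/local")` before the install, run the install, call it again, and pass `new_files(before, after)` to `write_manifest`. Tools like stow/xstow avoid the problem altogether by keeping each package in its own directory.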
Users
- Will rarely describe their problems with a level of detail that is needed to solve their problems.
- Avoid asking for their password so that you can log in as them to diagnose their problems.
- If you must, then they must change their password afterward.
- You can use su - userid or some other method to start a process as another user.
- Ask for a screenshot.
- Sometimes as easy as hitting the <Print Screen> key
- Or using their cell phones
- “Eat your own dog food.”
- Use the same environment as the users, the same machines on the same networks as the users, so that you can catch problems they may come to you with.
- Be a user. Use user tools.
- You can't be a UNIX admin without first being a UNIX user. (This also applies to Windows.)
- Do concern yourself with the usability of the OS environment for your users.
- This will have the added benefit of reducing your support headaches.
- Use sensible settings in default shell profiles in /etc/skel
- without compromising security
- no “.” in $PATH, umask=077
- an informative shell prompt
- Will the default user interface (window manager or desktop) be familiar and intuitive enough for CS 175 students and also minimize complaints from professors?
- The lack of drive letters and of easy access to USB flash drives could frustrate many users, whether they are Linux newbies or not.
- Systems such as autofs work for removable media, though setting them up is usually not easy.
- Balance against ease of administration.
- Simple window managers like IceWM: faster startup, smaller memory footprint, easy to configure (for the admin), familiar-looking interface
- Desktop environments like Gnome/KDE: slower startup, bigger memory footprint, easy to configure (for the user)
- User training
- A support web page with FAQs?
- Occasional “UNIX for Dummies” seminars?
- Don't assume that the users aren't smart enough or that they are not aware of published security exploits.
- The “would never happen here” mentality
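The two /etc/skel security settings mentioned above (no “.” in $PATH, umask 077) can be checked mechanically. In this sketch, any $PATH entry that is not an absolute path is flagged, which covers “.”, relative directories, and empty entries (an empty entry also means the current directory). The function names are hypothetical.

```python
def unsafe_path_entries(path_value):
    """Return the $PATH entries that are not absolute paths."""
    return [entry for entry in path_value.split(":") if not entry.startswith("/")]


def umask_is_private(mask):
    """True if the umask removes all group and other permission bits (e.g. 077)."""
    return mask & 0o077 == 0o077


print(unsafe_path_entries("/usr/local/bin:/usr/bin:.:/bin"))
print(umask_is_private(0o022))
```

For the current session, `unsafe_path_entries(os.environ["PATH"])` applies the same check to the live environment (after `import os`).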
Standardization vs. Diversity
- Standardize on one distro of Linux or use multiple distros?
- If you standardize, choose your standard distro wisely.
- Consider the projected continuity of a distro.
- Debian's continuity seems like a good bet.
- less so with many Debian derivatives
- Multiple distros also a good idea
- Red Hat is still the most widely used distro, and rpm the most widely used package format.
- Good to be familiar with rpm-based distros and Red Hat-like distros.
- Multiple UNIX versions also a good idea
- What if, by some miracle, SCO wins and Linux is lost?
- FreeBSD waiting in the wings
- Solaris x86? (future looks bleak)
- AIX? (not free; bleh)
- Mac OS X (see AIX)
Advocacy
- Should a UNIX admin advocate the use of UNIX?
- Up to a point
- An admin should advocate the right tool for the job ..
- .. and learn the right tool for the job, even if that happens to be Windows.
Learning Administration
- Learn administration by doing administration.
- More likely to learn if something goes wrong.
Keeping up with the Joneses
- What technologies are being used “out there,” and are we behind the times?
- LDAP instead of NIS
- cfengine instead of my own scripts
- systemimager instead of my own scripts
Ethics and Licenses
- A professor wants to use their single-user license for a language interpreter for an entire class of students ..
- .. and wants you to install the interpreter on the server.
- A student has 2 GB worth of mp3s in his home directory.
- A student has 2 GB worth of legitimate research output in his home directory.
- You have access to your boss' email spool file.
cs471/cs_471_-_general_sysadmin_principles.txt · Last modified: 2018/04/06 18:18 by jchung