Timing

How long does it take?

Except for the plain Bourne shell (sh), most shells offer a built-in time command that displays how long a command takes to execute. The timing is highly accurate and can be used on any command. In the example below, it takes about 15 seconds to make this book:


tille@blubber:~/training/unix-basics>time docbook2html abook.xml
Working on: /nethome/tille/training/unix-basics/abook.xml
Done.

real    0m15.468s
user    0m14.590s
sys     0m0.510s

In this example, the command took 15.468 seconds in total (real time), of which 14.590 seconds were spent executing user code (user time) and 0.510 seconds executing kernel routines (system time). The remainder was spent waiting.
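As a rough sketch of how to read these numbers, the lines below time a command with the shell's time keyword and then redo the arithmetic for the example above. The ls command is only a stand-in for a real job, the variable names are made up for illustration, and awk is used because the shell itself cannot do decimal arithmetic.

```shell
# Time any command with the shell's time keyword (assumes bash or a similar shell).
# ls is a stand-in for a real job; its output is discarded.
time ls / > /dev/null

# Values copied from the docbook2html example, in seconds.
real=15.468 user=14.590 sys=0.510

# CPU time is user + sys; whatever remains of the elapsed (real) time
# was spent waiting, for instance on disk I/O or on other processes.
cpu=$(awk -v u="$user" -v s="$sys" 'BEGIN { printf "%.3f", u + s }')
waiting=$(awk -v r="$real" -v c="$cpu" 'BEGIN { printf "%.3f", r - c }')
echo "cpu=${cpu}s waiting=${waiting}s"
```

A large gap between real time and CPU time usually points at waiting rather than computing, which is the subject of the performance discussion that follows.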

The C shell time gives some more information. It displays the total elapsed time: the program in the example might need only about 15 seconds of CPU time, but the time the user has to wait before the command terminates and the prompt returns can be longer, depending on the system load (how busy is the system?). The C shell time also displays the CPU time as a percentage of the total elapsed time, as well as virtual memory and I/O statistics. The output is controlled by the time shell variable.

Why does it take so long?

Performance

To a user, performance means quick execution of commands. To a system manager, on the other hand, it means much more: the sysadmin has to optimize performance for the whole system, including users, all programs and daemons. System performance can depend on a thousand tiny things which are not accounted for by the time command:

  • the program executing is badly written or doesn't use the computer appropriately

  • access to disks, controllers, display, all kinds of interfaces etc.

  • reachability of remote systems (network performance)

  • number of users on the system, number of users actually working simultaneously

  • time of day

  • ...

What is a high load?

In short: this depends on what is normal on your system. My old P133 running a firewall, SSH server, file server, a route daemon, a sendmail server, a proxy server and some other services doesn't complain with 7 users connected; the load is still 0 on average. Some (multi-CPU) systems I've seen were quite happy with a load of 67. There is only one way to find out: check the load regularly if you want to know what's normal. If you don't, you will only be able to measure system load from the response time of the command line, which is a very rough measurement since this speed is influenced by a hundred other factors.
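One way to check the load regularly is to record a sample now and then and look at the numbers after a few days. The following is a minimal sketch, not a finished tool: the load1 function name and the log location are made up for illustration, and the parsing assumes the usual "load average:" wording in the output of uptime.

```shell
# Extract the 1-minute load average from uptime-style output on stdin.
# Handles both "load average:" (Linux) and "load averages:" (BSD, macOS).
load1() {
    sed -e 's/.*load averages*: *//' -e 's/[, ].*//'
}

# Hypothetical log location; adjust to taste.
log="${TMPDIR:-/tmp}/load.log"

# Append one timestamped sample, if uptime is available on this system.
if command -v uptime > /dev/null 2>&1; then
    echo "$(date '+%Y-%m-%d %H:%M') $(uptime | load1)" >> "$log"
fi
```

Run at intervals, for instance from a scheduled job, this builds up a simple history that shows what a normal load looks like on your system.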

Keep in mind that different systems will behave differently with the same load average. E.g. a system with a graphics card supporting hardware acceleration will have no problem rendering 3D images, while the same system with a cheap VGA card will slow down tremendously while rendering. My old P133 will become quite uncomfortable when I start the X server, but on a modern system you hardly notice the difference in the system load.

What can I do as a user?

A big environment can slow you down. If you have lots of environment variables set (instead of shell variables), long search paths that are not optimized (errors in setting the PATH environment variable) and such, the system will need more time to search and read data.
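To get an impression of the size of your environment and the state of your search path, something along these lines may help. It is only a sketch: missing_dirs is a hypothetical helper name, and the check is limited to entries that are not existing directories.

```shell
# Count the exported environment variables (roughly one per line).
env | wc -l

# Print every entry of a colon-separated search path
# that is not an existing directory.
missing_dirs() {
    echo "$1" | tr ':' '\n' | while IFS= read -r dir; do
        [ -d "$dir" ] || echo "$dir"
    done
}

# Check the current search path; no output means every entry exists.
missing_dirs "$PATH"
```

Entries reported by the function are searched in vain every time the shell looks up a command, so they are good candidates for removal from the file that sets your path.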

In X, window managers and desktop environments can be real CPU eaters. A really fancy desktop comes with a price, even when you can download it for free, especially since most desktops provide add-ons ad infinitum. Modesty is a virtue if you don't buy a new computer every year.

Priority

The priority or importance of a job is defined by its "nice" number. A program with a high nice number is friendly to other programs, other users and the system: it is not an important job. The lower the nice number, the more important a job is and the more resources it will take without sharing them.

Making a job nicer by increasing its nice number is only useful for processes that use a lot of CPU time (compilers, math applications and such). Processes that only use a lot of I/O time are automatically rewarded by the system and given a higher priority (a lower nice number). E.g. keyboard input always gets highest priority on a system.

The priority of a program is set with the nice command. Behaviour may vary, so it is best to read the man page for your vendor-specific nice command.

Most systems also provide the BSD renice command, which allows you to change the niceness of a running command. Again, read the man page for system-specific information.

Note

It is NOT a good idea to nice or renice an interactive program or a job running in the foreground.
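The nice and renice commands described above might be used as in the following sketch. It assumes GNU-style behaviour, where nice without arguments prints the current niceness; sleep is only a stand-in for a CPU-heavy job, and the variable names are made up for illustration. Since behaviour differs between systems, treat the renice line in particular as an assumption to verify in your man page.

```shell
# Niceness of the current shell (usually 0) and of a child
# started with nice -n 5. Assumes a nice that prints the
# current niceness when called without a command (GNU, BusyBox).
base=$(nice)
nicer=$(nice -n 5 nice)
echo "shell: $base, child: $nicer"

# Start a background job with a niceness increment of 10.
# sleep stands in for a real CPU-heavy program.
nice -n 10 sleep 30 &
pid=$!

# Make the running job nicer still. On Linux (util-linux) the value
# after -n is the new niceness; POSIX defines it as an increment.
renice -n 15 -p "$pid"

# Clean up the stand-in job.
kill "$pid"
```

Ordinary users can normally only increase the niceness of their own jobs; lowering it again is reserved for the superuser.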

CPU resources

On every Unix system, many programs want to use the CPU(s) at the same time, even if you are the only user on the system. Every program needs a certain amount of cycles on the CPU to run. There may be times when there are not enough cycles because the CPU is too busy. The uptime command is wildly inaccurate (it only displays averages, you have to know what is normal), but far from being useless. There are some actions you can undertake if you think your CPU is to blame for the unresponsiveness of your system:

  • Run heavy programs when the load is low, e.g. during the night. See next section for scheduling.

  • Prevent the system from doing unnecessary work: stop daemons and programs that you don't use, use locate instead of a heavy find, ...

  • Run big jobs with a low priority

If none of these solutions are an option in your particular situation, you may want to upgrade your CPU.
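The first and last actions in the list above can be combined: start a big job only when the load is low, and give it a low priority when it does start. The sketch below assumes a Linux system, where the load averages can be read from /proc/loadavg; load_below is a hypothetical helper, the threshold of 1.0 is arbitrary, and the echo command stands in for the real job.

```shell
# Succeed when the given load is below the given threshold.
# awk does the comparison because the shell cannot compare decimals.
load_below() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load < max) }'
}

# Linux-specific: the first field of /proc/loadavg is the 1-minute load.
if [ -r /proc/loadavg ]; then
    read -r load1 _ < /proc/loadavg
    if load_below "$load1" 1.0; then
        # Run the stand-in job with the lowest priority.
        nice -n 19 sh -c 'echo "heavy job would run here"'
    else
        echo "load is $load1, not starting the job now"
    fi
fi
```

What counts as a low load depends on the system, so the threshold should come from your own observations of what is normal.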

Memory resources

When the currently running processes expect more memory than the system has physically available, a Unix system will not crash; it will start paging or swapping. This means using disk space (swap space) as memory: the contents of physical memory (pieces of running programs, or entire programs in the case of swapping) are moved to disk, which frees physical memory to handle more processes. This slows the system down enormously, since access to disk is much slower than access to memory. The top command can be used to display memory and swap use. If you find that a lot of memory and swap is being used, you can try:

  • Killing or stopping those programs that use a big chunk of memory

  • Adding more memory (and in some cases more swap space) to the system. As a traditional rule of thumb, swap space should be at least twice the size of physical memory.

  • Tuning system performance, which is beyond the scope of this document. See reading list for more.
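To find the programs that use a big chunk of memory without starting an interactive top session, the output of ps can be sorted by resident memory size. This is a sketch under assumptions: top_mem is a hypothetical helper, and the rss and comm output fields are widely supported by ps but not guaranteed on every Unix.

```shell
# Print the N largest memory users (default 5) from
# "RSS PID COMMAND" lines read on stdin, largest first.
top_mem() {
    sort -k1,1nr | head -n "${1:-5}"
}

# Feed it from ps, if available: resident set size (in kilobytes
# on most systems), process ID and command name, without headers.
if command -v ps > /dev/null 2>&1; then
    ps -e -o rss= -o pid= -o comm= | top_mem 5
fi
```

The process IDs in the second column are what you would pass to kill if you decide to stop one of these programs.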

I/O resources

While I/O limitations are a major cause of stress for sysadmins, the Unix system offers rather poor utilities to measure I/O performance. The ps, vmstat and top commands give some indication of how many programs are waiting for I/O, netstat displays network interface statistics, nfsstat displays NFS statistics, and the various proprietary implementations of Unix may offer vendor-specific I/O subsystem information, but there are virtually no tools available to measure the I/O response to system load.

Each device has its own problems, but we can safely say that on the one hand the bandwidth available to network interfaces and on the other hand bandwidth available to disks, are the two most important bottlenecks in I/O performance.

Network I/O problems:

  • Network overload:

    Occurs when the amount of data transported over the network is larger than the network's capacity, which results in slow execution of every network-related task for all users. This can be solved by cleaning up the network (which mainly involves disabling protocols and services that you don't need) or by reconfiguring the network (e.g. use of subnets, replacing hubs with switches, upgrading interfaces and equipment).

  • Network integrity problems:

    Occurs when data is transferred incorrectly. This kind of problem can only be solved by isolating the faulty element and replacing it.

Disk I/O problems:

  • per-process transfer rate:

    Read or write speed for a single process.

  • aggregate transfer rate:

    Maximum total bandwidth that the system can provide to all programs that run.

Users

Users can be divided into several classes, depending on their behaviour and resource usage:

  • Users who run a large number of small jobs (e.g. the trainees in this classroom, users who spend a lot of time editing or using Unix utilities).

  • Users who run relatively few but large jobs (e.g. users running simulations, calculations or emulators, usually with accompanying large datafiles).

  • Users who run few jobs but use a lot of CPU time (e.g. developers).

You can see for yourself that system requirements may vary for each class of users, and that it can be hard to satisfy everyone.