Sometimes it is useful to take a closer look at the performance of your jobs using standard Linux tools. We will not describe these tools in detail, but only mention a few.

Using the ssh command-line tool you can log in to the nodes where one of your jobs is running. Please note that you can only log in to these nodes; a connection to any other node will be refused.

Logging in can be done like this:

ssh node34

After this you should get a command-line prompt on that node (provided you have a job running there). You can use squeue, described earlier, to see where your job is running.
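As an illustration of looking up the node name first, the sketch below wraps squeue in a small shell function. The function name my_job_nodes is made up for this example, and the output format is only one possible choice.

```shell
# Hypothetical helper: list the nodes on which your running jobs are placed,
# one line per job.
# -u limits the output to your own jobs, -t RUNNING skips pending jobs,
# -h suppresses the header, -o selects the columns (%i job ID, %N node list).
my_job_nodes() {
    squeue -u "$(id -un)" -t RUNNING -h -o "%i %N"
}
```

Calling my_job_nodes on a login node prints the job ID and node list of each running job; one of those node names can then be passed to ssh as shown above.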

There are two tools that we will describe on this page. The first is top, the second ps.

top

The top tool shows an overview of the running processes on the system. You can limit it to your own processes using the -u option, for example top -u $USER. The %CPU column shows the CPU utilization; memory usage is shown in the VIRT column for virtual (claimed) memory and in the RES column for memory that is actually resident in RAM. The CPU usage should ideally be close to 100% for single-core tasks and close to n*100% for multithreaded tasks, where n is the number of threads.
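The rule of thumb for %CPU can be turned into a quick calculation. The following is only a sketch: the function name cpu_efficiency is made up for this example, and the %CPU value and thread count are typed in by hand from what top shows.

```shell
# Hypothetical helper: express a %CPU value from top as a percentage of
# the ideal n*100% for a program that runs n threads.
# Since (pcpu / (n * 100)) * 100 equals pcpu / n, one division is enough.
cpu_efficiency() {
    awk -v pcpu="$1" -v n="$2" 'BEGIN { printf "%.0f\n", pcpu / n }'
}

# Example: top reports 380 %CPU for a program running 4 threads.
cpu_efficiency 380 4   # prints 95
```

A result well below 100 suggests that the threads are waiting (for I/O, for each other, or for memory) instead of computing.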

To see the individual threads of a multithreaded program, press H (shift-h). The q key exits top.
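top can also be run non-interactively, which is convenient for saving a snapshot to a file. The sketch below assumes the procps-ng version of top that ships with most Linux distributions; the function names top_snapshot and top_threads are made up for this example.

```shell
# Hypothetical helper: print one non-interactive snapshot of your own
# processes. -b selects batch mode, -n 1 stops after a single update,
# -u restricts the list to the given user.
top_snapshot() {
    top -b -n 1 -u "$(id -un)"
}

# Hypothetical helper: the same snapshot, but listing the individual
# threads of one process. -H shows threads, -p selects the process ID
# given as the first argument.
top_threads() {
    top -b -n 1 -H -p "$1"
}
```

For example, top_snapshot > top.log stores the snapshot in a file, and top_threads followed by a process ID taken from that snapshot lists the threads of that process.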

ps

Another useful tool is ps, which shows the processes on the node. To see an extensive overview of your own processes you can use:

ps -elf | grep -E "UID|$USER"

Please see the man page of ps for information about the meaning of the fields.
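As an alternative to filtering the full listing with grep, ps can select the user and the columns itself. The sketch below assumes the procps version of ps found on most Linux systems; the function name my_procs and the choice of columns are made up for this example.

```shell
# Hypothetical helper: list your own processes with a few columns that are
# useful for judging a job, sorted by CPU usage (highest first).
#   pid    process ID
#   nlwp   number of threads
#   pcpu   CPU usage, averaged over the lifetime of the process
#   pmem   resident memory as a percentage of physical memory
#   vsz    virtual memory size in KiB
#   rss    resident memory size in KiB
#   etime  elapsed time since the process started
#   comm   command name
my_procs() {
    ps -u "$(id -un)" -o pid,nlwp,pcpu,pmem,vsz,rss,etime,comm --sort=-pcpu
}
```

Note that pcpu in ps is an average over the whole run time of the process, so it can differ from the %CPU value that top shows for the most recent interval.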