Linux Workstation Information

Linux

Our workstations run Ubuntu Linux and are upgraded every two years to the latest long-term support (LTS) release of Ubuntu. If you aren’t familiar with Linux, you’ll want to learn a little about using commands in a terminal window. Here’s a helpful introduction to the command line that should get you started.

Please do not shut down a Linux workstation, even if you are done using it for the day. It needs to be left running so other users can run jobs on it remotely and the backup servers can access it for the nightly backups.

Data Storage

Each user has a home directory on network storage with a path like /home/{lab-name}/{user-name} where {lab-name} is replaced with the name of the lab (e.g., kuhn) and {user-name} is replaced with the login name of the user. Home directories are backed up nightly to our backup servers so they are the ideal location for important documents and critical data. Take care to limit how much you keep in your home directory so that it doesn’t overload our backup servers. (Note: each home directory may also be accessed using the path /net/{lab-name}/home/{user-name}.)

In addition to a home directory, users also have two scratch directories located on network storage. One is named scratch and the other is named shared-scratch. Access to the scratch directory is limited to members of the same lab, while access to the shared-scratch directory is open to all users in Structural Biology. These directories have paths that follow these templates: /net/{lab-name}/scratch/{user-name} and /net/{lab-name}/shared-scratch/{user-name}. Although these scratch directories are not backed up like home directories, any deleted files are kept in a special Trash area for up to a week before being permanently removed from the storage system.
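As a rough illustration of the path templates above, the following sketch builds each path from a lab name and a user name and reports whether it exists on the current machine. The lab name kuhn is the example from the text; the variable names are made up for this sketch, and the user name is assumed to match your login name.

```shell
# Example values: "kuhn" is the lab name used in the text above.
# The user name is assumed to be the same as your login name.
lab_name="kuhn"
user_name="${USER:-$(id -un)}"

# Build each path from the templates described above.
home_dir="/home/${lab_name}/${user_name}"
alt_home_dir="/net/${lab_name}/home/${user_name}"
scratch_dir="/net/${lab_name}/scratch/${user_name}"
shared_scratch_dir="/net/${lab_name}/shared-scratch/${user_name}"

# Report which of these directories are visible from this machine.
for dir in "$home_dir" "$alt_home_dir" "$scratch_dir" "$shared_scratch_dir"; do
    if [ -d "$dir" ]; then
        echo "found:   $dir"
    else
        echo "missing: $dir"
    fi
done
```

On a workstation, all four paths should be reported as found once lab_name is set to your own lab.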

 

You can quickly see how much space is occupied in a directory by using the dirsize command. Here’s an example of its use for a directory relion-data and its subdirectories:

stevew@otter:~$ dirsize relion-data
34.1 GB project-A
46.5 GB project-B
79.0 GB project-C
103.1 GB project-D
218.7 GB CombinedResults

Archiving Data

Data that is no longer in active use should be archived to Fortress, RCAC’s archival system. If you have a lot of smaller files (less than 50 MB each), then you should first package your files using tar or a similar command. The recommended way to transfer your data to or from Fortress is through the Globus interface. After logging in to Globus, you can use the Globus File Manager for transferring data. Search for the Purdue Fortress HPSS Archive collection to connect to Fortress and for the Purdue Cryo-EM Gateway to connect to our infrastructure here in Structural Biology.
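The packaging step mentioned above might look like the following sketch. It runs in a temporary directory with made-up sample files so that it can be tried safely; the directory name project-A and the archive name are only placeholders, and gzip compression (the z option) is an assumption rather than a requirement.

```shell
# Work in a throwaway directory so the demo does not touch real data.
workdir="$(mktemp -d)"
cd "$workdir"

# Placeholder project directory containing a few small sample files.
mkdir -p project-A
for i in 1 2 3; do
    echo "sample $i" > "project-A/file$i.txt"
done

# Package the whole directory into a single compressed archive.
tar -czf project-A.tar.gz project-A

# List the archive contents to confirm everything was captured
# before transferring the archive with Globus.
tar -tzf project-A.tar.gz
```

For real data, only the two tar commands are needed, run from the directory that contains the data you want to archive.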

When archiving data to Fortress, it is recommended to store the data in your group directory so that the lab PI can easily find and access that data in the future. The path for a group directory on Fortress begins with /group followed by the login name of the lab PI. For example, the group directory for Wen Jiang’s lab would be /group/jiang12.

Software

We have a large number of software applications available on our Linux workstations. Most of the common applications can be started by just typing the program name on a command line. For example, to run Coot you would just type coot on a command line and press Enter.

Some applications, though, aren’t accessible without first using the module command. The module command changes the environment in your terminal session so that it is prepared to run a particular program. Typing module available will display all the programs available as modules. Once you find the module for the program you want to run, use module load {program} to set up your environment to run that program, replacing {program} with the program module that you selected. Now everything should be ready for you to run the selected program.
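A typical session with the module command might look like the sketch below. The module name relion is only a placeholder and may not match what is installed, so choose a name from the module available listing. The check for the module command and the final module list call are additions for this sketch, not steps required by the text.

```shell
# Placeholder module name; pick a real one from the "module available" listing.
program="relion"

if type module >/dev/null 2>&1; then
    module available          # show all programs available as modules
    module load "$program"    # set up this terminal session for the program
    module list               # show which modules are currently loaded
else
    echo "The module command is not available in this shell." >&2
fi
```

The environment changes apply only to the terminal session in which module load was run, so repeat the command in each new terminal window.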

Remote Access

All remote access from off campus must go through Purdue’s VPN (Virtual Private Network). Please see Purdue IT’s instructions on their website to learn how to connect to the Purdue VPN.

You may use an SSH client to access our systems if you only need a command line interface. Otherwise, you may use either X2Go or NoMachine NX clients for graphical access. X2Go will create a new desktop session while NoMachine NX attempts to remotely attach to a session currently active locally on the workstation. X2Go is installed by default on our workstations but NoMachine NX is only installed by request as needed. Please note that you should not attempt to connect to a workstation with X2Go if you already have a local desktop session running on that workstation.

Another option for remote graphical access is ThinLinc, which requires the purchase of a license. It is similar to X2Go but more capable. As with X2Go, never connect remotely to a workstation using ThinLinc if you currently have an active local session running on that workstation.