Using PGP Encryption in Pipelines



This bulletin explains how to work with data files encrypted with PGP without copying the decrypted file to disk. Such an approach enhances the security of your data analysis.


PGP stands for "Pretty Good Privacy" and is also the name of a program (pgp) that provides strong encryption and authentication services that make it possible to protect your data so that only intended co-workers can read it. You can also digitally sign data, which ensures its authenticity and that it has not been altered along the way.

Extensive support for email encryption, decryption and signature is also provided by PGP, but those features are not within the domain of this bulletin.

Complete documentation for PGP may be found in the directory /usr/doc/pgp-6.5.8 in the form of PostScript and Adobe PDF files. Read the PGPCmdLineGuide file to get started.

Put your passphrase into an environment variable

Your passphrase is the secret key used to encrypt and decrypt your file. Choose it carefully so that you can remember it without having to write it down. It should mix upper- and lowercase letters with special characters, and it should be much longer than the traditional 6 to 8 character passwords required by the NU NetID system.

Assuming you're using the GNU Bourne-Again SHell (bash), type the command

export PGPPASS="Your PGP Passphrase Should be Long"

Provide your own long passphrase in the quoted argument.

This puts your passphrase into the shell environment variable named PGPPASS. If you don't do this, pgp will prompt you at the keyboard for the passphrase -- and that won't work when pgp is used in a pipeline expression.

This export command should always be typed by hand, and should never be put into a file. Otherwise, anyone who can access that file will obtain your passphrase.

This method of putting your passphrase into an environment variable displays it on your screen. See "Entering your passphrase without displaying it on the screen" below for a more secure way to create your PGPPASS variable.

Remove your passphrase from the environment variable

The PGPPASS shell environment variable is inherited by shells and programs started by your current shell. This means that the programs you run after creating your PGPPASS variable can examine that variable and capture its value -- i.e. capture your passphrase. To prevent that, merely undefine your PGPPASS variable with the command

unset PGPPASS
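
The inheritance described above can be checked directly from the shell. The following is a minimal sketch, not part of the original bulletin: the passphrase value is a placeholder, and the names seen_before and seen_after are chosen only for illustration. It starts a child shell before and after the unset command and records what each child can see.

```shell
# Placeholder value for illustration only; type a real passphrase by hand
# and never store it in a file.
export PGPPASS="example passphrase, not a real one"

# A child process (here a second shell) inherits the exported variable.
seen_before=$(sh -c 'printf "%s" "${PGPPASS:-}"')

unset PGPPASS

# After unset, newly started child processes no longer receive it.
seen_after=$(sh -c 'printf "%s" "${PGPPASS:-}"')

echo "child before unset: ${seen_before:+<passphrase visible>}"
echo "child after unset:  ${seen_after:-<nothing>}"
```

Programs that were already running before the unset keep their own copy of the environment, so unset the variable as soon as your pipeline has finished.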

Encrypt a file

pgp -c data

This command encrypts the file named data. The encrypted file is written to data.pgp, and the original file data remains unchanged. Remove the original only after you have checked your work and confirmed that you can decrypt data.pgp.

This form of encryption does not involve a public key, so you don't have to spend the time to create one and save it someplace special.
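
The advice to prove that you can decrypt the file before removing the original can be sketched as a small shell function. This is only an illustration under stated assumptions: remove_if_roundtrip_ok is a hypothetical helper name, PGPPASS is assumed to be set already, and the pgp -f filter form is the one used in the decryption section of this bulletin.

```shell
# Hypothetical helper: delete the plaintext only if the encrypted copy
# decrypts to identical bytes. Assumes PGPPASS is already set.
remove_if_roundtrip_ok() {
    original=$1
    encrypted=$2
    # cmp -s reads the decrypted stream on standard input ("-") and
    # compares it with the original file without printing anything.
    if pgp -f < "$encrypted" 2>Errors | cmp -s - "$original"; then
        rm -- "$original"
    else
        echo "decrypted output differs from $original; keeping it" >&2
        return 1
    fi
}
```

A possible use, after encrypting with pgp -c data, would be: remove_if_roundtrip_ok data data.pgp. The decrypted bytes pass through the pipe to cmp and are never written to disk.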

Decrypt your file with pgp

To decrypt that file and decompress it in the same step as a pipeline suitable for use within SAS, for example, do the following:

First, if your PGPPASS variable is undefined, define it (again, using your own secret passphrase).

export PGPPASS="Your PGP Passphrase Should be Long"

Then create a UNIX pipeline expression of the following form:

pgp -f < data.pgp 2>Errors

In this expression, pgp reads and decrypts the file named data.pgp and writes the decrypted records to standard output.

Some extraneous pgp messages go into the file Errors (redirected from standard error). If pgp reports any errors, those messages will also appear in that file.

The decrypted file will emerge on standard output, which SAS would then consume as a data file if you create a SAS FILENAME command containing your pipeline expression.
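
Before placing the pipeline in a SAS program, it may help to confirm from the shell that decryption works. The following is a minimal sketch, assuming PGPPASS is already set; preview_decrypted is a hypothetical helper name, not a command provided on the cluster.

```shell
# Hypothetical helper: show the first few decrypted records without
# writing the plaintext to disk. Assumes PGPPASS is already set.
preview_decrypted() {
    encrypted=$1
    count=${2:-5}            # number of records to show, default 5
    # pgp messages go to the file Errors, as in the pipeline above.
    pgp -f < "$encrypted" 2>Errors | head -n "$count"
}
```

For example, preview_decrypted data.pgp 10 would print the first ten records. If nothing appears, look in the file Errors for the reason.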

Entering your passphrase without displaying it on the screen

You can create your PGPPASS variable without displaying the passphrase on your screen by using the command:

source /sscc/opt/local/bin/getpgp
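
The contents of the getpgp script are not reproduced in this bulletin. As a rough idea of how such a script might work, the sketch below reads a passphrase with terminal echo turned off and exports it. The function name set_pgppass is hypothetical, and this is not necessarily how getpgp is implemented.

```shell
# Hypothetical sketch; not the actual getpgp script.
set_pgppass() {
    printf 'PGP passphrase: ' >&2
    if [ -t 0 ]; then
        stty -echo              # stop echoing typed characters
        IFS= read -r PGPPASS
        stty echo               # restore normal echo
        printf '\n' >&2
    else
        IFS= read -r PGPPASS    # input is not a terminal, nothing to hide
    fi
    export PGPPASS
}
```

Like getpgp, a function of this kind has to run in your current shell (by being sourced or defined there), because a variable exported by a separate child process would not reach the shell that later starts pgp.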

Example SAS program with the pgp pipeline

The following SAS FILENAME command associates a pipeline constructed in the manner above with the SAS filename people. The subsequent DATA step named dstep reads data records from the input stream named people, which is the output of our pipeline of UNIX commands that decrypt and decompress the file named data.gz.pgp.

FILENAME people PIPE "pgp -f < data.gz.pgp 2>Errors | gzcat" ;

DATA dstep;
INFILE people OBS=100 LRECL=3249 ;
INPUT blah blah;
RUN;


For further information

Complete documentation for PGP is found in the directory /usr/doc/pgp-6.5.8 in the form of PostScript and Adobe PDF files. Read the PGPCmdLineGuide file to get started.

See the "SAS Companion for the UNIX Environment and Derivatives" in the section "Reading from and Writing to UNIX Commands" on page 119 of the Version 6 First Edition.

