Managing Input and Output
This chapter explains the basics of managing file input and output when retrieving information from one or more PDF documents.
You will learn how to:
- Retrieve information from an unsecured file: the basic command line for retrieving information from a PDF file that has no security set
- Obtain information from an encrypted file (-d option): the basic command line for retrieving information from an encrypted PDF file
- Gather information from multiple files: retrieve information from more than one file at a time
- Include the file name in the results (-name option): report the name of each PDF file, which is helpful when processing multiple files at one time
- Save the results to a text file (-o option): write the report to a text file instead of to STDOUT
For purposes of illustration, the examples in this chapter focus on managing input and output for the retrieval of document description information. (Document description information is discussed in Getting Document Information.) However, the input and output functions described here apply to all APGetInfo information gathering options, not just to obtaining document description information.
No special action is required to retrieve information from a PDF document that has no security applied to it. Just enter the apgetinfo command, followed by one or more options, and then the path name of the input file.
$apgetinfo [options] inPDFFile
$apgetinfo -info C:\Appligent\APGetInfo\samples\ApUtilsSample.pdf
$./apgetinfo -info /Appligent/APGetInfo/samples/ApUtilsSample.pdf
The figure below shows the results of running apgetinfo with the -info option on the ApUtilsSample.pdf file supplied with APGetInfo. APGetInfo retrieves document description information from the document and sends it to the standard output device (STDOUT), the screen.
Note: The same command was used in Retrieving Document Description Information.
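If you call APGetInfo from a script rather than typing the command by hand, the syntax above can be sketched in Python as follows. This is only an illustration: the helper names build_command and run are hypothetical, and the sketch assumes that an executable named apgetinfo can be found on your PATH (pass a different executable value otherwise).

```python
import subprocess
from typing import List, Sequence


def build_command(
    input_files: Sequence[str],
    options: Sequence[str] = ("-info",),
    executable: str = "apgetinfo",  # assumption: apgetinfo is on the PATH
) -> List[str]:
    """Return an argument list of the form: apgetinfo [options] inPDFFile."""
    if not input_files:
        raise ValueError("at least one input PDF file is required")
    # Options come first, followed by the input file path(s).
    return [executable, *options, *input_files]


def run(argv: Sequence[str]) -> str:
    """Run the command and return whatever it wrote to STDOUT."""
    completed = subprocess.run(list(argv), capture_output=True, text=True)
    return completed.stdout


# Build (but do not run) the command for the sample file.
command = build_command(
    ["/Appligent/APGetInfo/samples/ApUtilsSample.pdf"],
    executable="./apgetinfo",
)
print(command)
```

Passing the arguments as a list avoids quoting problems when a path name contains spaces.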
To get information about an encrypted file, you must supply its User password. (The User password, which is also called an Open password, is required to open the document.)
Note: An Owner password (also called a Permissions or Master password) will not work correctly with APGetInfo. You must supply a User password.
$apgetinfo -d userpassword [other options] inPDFFile
$apgetinfo -d userpass -info C:\Appligent\APGetInfo\samples\ApUtilsSampleEnc.pdf
$./apgetinfo -d userpass -info /Appligent/APGetInfo/samples/ApUtilsSampleEnc.pdf
This command uses the -d option to supply the User password (userpass) required to open ApUtilsSampleEnc.pdf. It then retrieves document description information from the document and directs the output to the screen (STDOUT). The output is shown in the figure below. It matches the output for the unsecured ApUtilsSample.pdf file shown in the figure above, except for the encryption information, the Changing ID, and the file size. (ApUtilsSampleEnc.pdf was created by saving ApUtilsSample.pdf under the new name, and then adding a User password and other security settings.)
Note: If you fail to specify the -d option and a User password for an encrypted document, you will receive this error message:
Error: this document requires a password
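A script that processes encrypted files may want to add the -d option and recognize the error message quoted above. The following Python sketch shows one way to do that. The helper names are hypothetical, and whether APGetInfo writes the message to STDOUT or to STDERR is not stated here, so the check simply takes whatever text you captured.

```python
from typing import List, Sequence

# The message APGetInfo reports when a password is missing.
PASSWORD_ERROR = "Error: this document requires a password"


def build_encrypted_command(
    input_file: str,
    user_password: str,
    options: Sequence[str] = ("-info",),
    executable: str = "apgetinfo",  # assumption: apgetinfo is on the PATH
) -> List[str]:
    """Return: apgetinfo -d userpassword [other options] inPDFFile."""
    return [executable, "-d", user_password, *options, input_file]


def needs_password(captured_text: str) -> bool:
    """Return True if the captured output contains the password error."""
    return PASSWORD_ERROR.lower() in captured_text.lower()


command = build_encrypted_command(
    "/Appligent/APGetInfo/samples/ApUtilsSampleEnc.pdf", "userpass"
)
print(command)
```

Remember that the password supplied with -d must be the User password, not the Owner password.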
You can retrieve information from more than one file at a time by specifying the path name for each input file. If you want to retrieve information from all of the PDF files in a directory, you can use the *.pdf wildcard specification as a shortcut.
If any of the input files are encrypted, you must also specify a User password. However, a command can include only one User password, and no information is retrieved from encrypted files that have a different password. Submit a separate command for each User password.
$apgetinfo [options] inPDFFile1 [inPDFFile2...]
- On Windows, using full path names:
C:\Appligent\APGetInfo\apgetinfo -info -d userpass C:\Appligent\APGetInfo\samples\ApUtilsSample.pdf C:\Appligent\APGetInfo\samples\ApUtilsSampleEnc.pdf
- On Windows, substituting a wildcard shortcut:
C:\Appligent\APGetInfo\apgetinfo -info -d userpass C:\Appligent\APGetInfo\samples\*.pdf
- On Unix-like systems, using full path names:
$./apgetinfo -info -d userpass /Appligent/APGetInfo/samples/ApUtilsSample.pdf /Appligent/APGetInfo/samples/ApUtilsSampleEnc.pdf
- On Unix-like systems, substituting a wildcard shortcut:
$./apgetinfo -info -d userpass /Appligent/APGetInfo/samples/*.pdf
The figure below shows the results of running apgetinfo with the -info option and the -d option on the ApUtilsSample.pdf and ApUtilsSampleEnc.pdf files supplied with APGetInfo. APGetInfo retrieves document description information from both of these documents and sends the results to the standard output device (STDOUT), the screen. The -d option is required to open the encrypted ApUtilsSampleEnc.pdf file. The password is ignored for ApUtilsSample.pdf, which is unsecured and does not need one.
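Because each command accepts only one User password, a batch of files with different passwords has to be split into one command per password. The Python sketch below illustrates that grouping. The function name commands_by_password and the idea of keeping a path-to-password mapping are assumptions made for this example, not part of APGetInfo.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def commands_by_password(
    files_with_passwords: Dict[str, Optional[str]],
    options: Sequence[str] = ("-info",),
    executable: str = "apgetinfo",  # assumption: apgetinfo is on the PATH
) -> List[List[str]]:
    """Build one argument list per distinct User password.

    files_with_passwords maps each input path to its User password,
    or to None when the file is not encrypted.
    """
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for path, password in files_with_passwords.items():
        groups[password].append(path)

    commands = []
    for password, paths in groups.items():
        argv = [executable, *options]
        if password is not None:
            # Only one -d option is supplied per command.
            argv += ["-d", password]
        argv += sorted(paths)
        commands.append(argv)
    return commands


for argv in commands_by_password(
    {"plain.pdf": None, "first.pdf": "pass1", "second.pdf": "pass2"}
):
    print(argv)
```

Unsecured files are grouped separately here for clarity. Since the password is ignored for files that do not need one, you could equally add them to any of the password groups.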
When you run multiple commands or include multiple input files on one command, it can be helpful to include the name of each input file with its results. You can do this with the -name option.
$apgetinfo -name [other options] inPDFFile
$apgetinfo -name -info -d userpass C:\Appligent\APGetInfo\samples\*.pdf
$./apgetinfo -name -info -d userpass /Appligent/APGetInfo/samples/*.pdf
The figure below shows the results of running apgetinfo with the -name option, -info option, and -d option on the two sample files supplied with APGetInfo: ApUtilsSample.pdf and ApUtilsSampleEnc.pdf. (The -d option is required because ApUtilsSampleEnc.pdf is an encrypted file that can only be opened with a User password.) The output is essentially the same as that shown in the figure above, except for the addition of a file name at the beginning of each set of results. The file names, included because of the addition of the -name option, clearly indicate which file each set of results belongs to.
Note: These results were obtained on a Windows platform, as indicated by the C: root directory and backward slashes in the file names.
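When a script launches APGetInfo directly, without going through a command shell, the *.pdf wildcard may not be expanded for you. The following Python sketch expands the pattern itself and adds the -name option so that each set of results is labeled. The helper name build_named_command is hypothetical, and the sketch assumes apgetinfo is on your PATH.

```python
import glob
from typing import List, Optional, Sequence


def build_named_command(
    pattern: str,
    options: Sequence[str] = ("-info",),
    user_password: Optional[str] = None,
    executable: str = "apgetinfo",  # assumption: apgetinfo is on the PATH
) -> List[str]:
    """Return: apgetinfo -name [other options] file1 file2 ..."""
    # Expand the wildcard in Python; sort for a predictable order.
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")

    argv = [executable, "-name", *options]
    if user_password is not None:
        argv += ["-d", user_password]
    return argv + files
```

For example, build_named_command("/Appligent/APGetInfo/samples/*.pdf", user_password="userpass") produces an argument list equivalent to the command shown above, with the wildcard replaced by the matching file names.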
By default, APGetInfo directs all output to the screen (STDOUT). To write your output to a file instead, specify the -o option, followed by the name of a text file. If you want the output file to go to a particular directory, make sure that directory already exists, because APGetInfo will not create it.
$apgetinfo -o outFile.txt [other options] inPDFFile
$apgetinfo -o C:\Appligent\APGetInfo\samples\myoutput.txt -name -info -d userpass C:\Appligent\APGetInfo\samples\*.pdf
$./apgetinfo -o /Appligent/APGetInfo/samples/myoutput.txt -name -info -d userpass /Appligent/APGetInfo/samples/*.pdf
The results are exactly the same as those shown in the figure above, except they are saved in myoutput.txt instead of being displayed on the screen.
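Since APGetInfo will not create the output directory, a script can create it before building the command. The Python sketch below shows one possible approach. The helper name build_output_command is hypothetical, and creating the directory automatically is a design choice for this example; you may prefer to stop with an error instead.

```python
import os
from typing import List, Optional, Sequence


def build_output_command(
    out_file: str,
    input_files: Sequence[str],
    options: Sequence[str] = ("-name", "-info"),
    user_password: Optional[str] = None,
    executable: str = "apgetinfo",  # assumption: apgetinfo is on the PATH
) -> List[str]:
    """Return: apgetinfo -o outFile.txt [other options] inPDFFile ..."""
    # Make sure the target directory exists before the command runs.
    out_dir = os.path.dirname(os.path.abspath(out_file))
    os.makedirs(out_dir, exist_ok=True)

    argv = [executable, "-o", out_file, *options]
    if user_password is not None:
        argv += ["-d", user_password]
    return argv + list(input_files)
```

The function only prepares the directory and the argument list. Running the resulting command, for example with subprocess.run, is left to the caller.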