searching in pdf files using grep : pdfgrep

Just as we use grep to search for patterns in a text file we can use pdfgrep to search for strings in a pdf file. In debian based systems we can install the package from the package manager or from the terminal using

sudo apt-get install pdfgrep

Once installed we can use this command from the terminal to search for the strings in pdf files. The syntax for use of the command is

$ pdfgrep [option]

Let us say we are searching for string “ioctl” in a pdf file name ch03.pdf (which is the third chapter from Linux device drivers book) .

$ pdfgrep ioctl ch03.pdf int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long); The ioctl system call offers a way to issue device-specific commands (such as formatting a track of a floppy disk, which is which is neither reading nor writing). Additionally, a few ioctl commands are recognized by the kernel without referring to to the fops table. If the device doesn’t provide an ioctl method, the system call returns an error for any request that method, the system call returns an error for any request that isn’t predefined (-ENOTTY, “No such ioctl for device”). = scull_llseek, .read = scull_read, .write = scull_write, .ioctl = scull_ioctl, .open = scull_open, .release = scull_release, .read = scull_read, .write = scull_write, .ioctl = scull_ioctl, .open = scull_open, .release = scull_release, }; check this field for read/write permission in your open or ioctl function, but you don’t need to check permissions for read or by changing both the current and default values using ioctl at runtime. Using a macro and an integer value to allow both in ) here, and the rest in the section “Using the ioctl Argument” in Chapter 1; they use some special,

It lists out all the lines that contain the string “ioctl”. To make the output look more easier to read we can prefix each line with the page number on which it occurs using the option “-n”.

$ pdfgrep -n ioctl ch03.pdf 10: int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long); 10: The ioctl system call offers a way to issue device-specific commands (such as formatting a track of a floppy disk, which 10: is neither reading nor writing). Additionally, a few ioctl commands are recognized by the kernel without referring to the 10: to the fops table. If the device doesn’t provide an ioctl method, the system call returns an error for any request that 10: the system call returns an error for any request that isn’t predefined (-ENOTTY, “No such ioctl for device”). 12: .llseek = scull_llseek, .read = scull_read, .write = scull_write, .ioctl = scull_ioctl, .open = scull_open, .release = 12: .read = scull_read, .write = scull_write, .ioctl = scull_ioctl, .open = scull_open, .release = scull_release, }; 12: check this field for read/write permission in your open or ioctl function, but you don’t need to check permissions for 21: or by changing both the current and default values using ioctl at runtime. Using a macro and an integer value to allow 23: in ) here, and the rest in the section “Using the ioctl Argument” in Chapter 1; they use some special,

If we want to count the number of occurrences instead of viewing the lines on which they appear we can use the option “-c”

$ pdfgrep -c ioctl ch03.pdf 10

We can also prefix each line of output with the name of the file in which the line appears, which is useful when searching in multiple files, by using the option -H.


Tags: , ,
Copyright 2017. All rights reserved.

Posted June 13, 2013 by Tux Think in category "Linux