DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

/usr/man/cat.1/pcregrep.1.Z(/usr/man/cat.1/pcregrep.1.Z)





NAME

       pcregrep - a grep with Perl-compatible regular expressions.


SYNOPSIS

       pcregrep [options] [long options] [pattern] [path1 path2 ...]


DESCRIPTION


       pcregrep  searches  files  for  character  patterns, in the same way as
       other grep commands do, but it uses the PCRE regular expression library
       to support patterns that are compatible with the regular expressions of
       Perl 5. See pcrepattern(3) for a full description of syntax and  seman-
       tics of the regular expressions that PCRE supports.

       Patterns,  whether  supplied on the command line or in a separate file,
       are given without delimiters. For example:

         pcregrep Thursday /etc/motd

       If you attempt to use delimiters (for example, by surrounding a pattern
       with  slashes,  as  is common in Perl scripts), they are interpreted as
       part of the pattern. Quotes can of course be used on the  command  line
       because they are interpreted by the shell, and indeed they are required
       if a pattern contains white space or shell metacharacters.

       The first argument that follows any option settings is treated  as  the
       single  pattern  to be matched when neither -e nor -f is present.  Con-
       versely, when one or both of these options are  used  to  specify  pat-
       terns, all arguments are treated as path names. At least one of -e, -f,
       or an argument pattern must be provided.

       If no files are specified, pcregrep reads the standard input. The stan-
       dard  input  can  also  be  referenced by a name consisting of a single
       hyphen.  For example:

         pcregrep some-pattern /file1 - /file3

       By default, each line that matches the pattern is copied to  the  stan-
       dard  output, and if there is more than one file, the file name is out-
       put at the start of each line. However,  there  are  options  that  can
       change how pcregrep behaves. In particular, the -M option makes it pos-
       sible to search for patterns that span line boundaries. What defines  a
       line boundary is controlled by the -N (--newline) option.

       Patterns  are  limited  to  8K  or  BUFSIZ characters, whichever is the
       greater.  BUFSIZ is defined in <stdio.h>.

       If the LC_ALL or LC_CTYPE environment variable is  set,  pcregrep  uses
       the  value to set a locale when calling the PCRE library.  The --locale
       option can be used to override this.


OPTIONS


       --        This terminate the list of options. It is useful if the  next
                 item  on  the command line starts with a hyphen but is not an
                 option. This allows for the processing of patterns and  file-
                 names that start with hyphens.

       -A number, --after-context=number
                 Output  number  lines of context after each matching line. If
                 filenames and/or line numbers are being output, a hyphen sep-
                 arator  is  used  instead of a colon for the context lines. A
                 line containing "--" is output between each group  of  lines,
                 unless  they  are  in  fact contiguous in the input file. The
                 value of number is expected to be relatively small.  However,
                 pcregrep guarantees to have up to 8K of following text avail-
                 able for context output.

       -B number, --before-context=number
                 Output number lines of context before each matching line.  If
                 filenames and/or line numbers are being output, a hyphen sep-
                 arator is used instead of a colon for the  context  lines.  A
                 line  containing  "--" is output between each group of lines,
                 unless they are in fact contiguous in  the  input  file.  The
                 value  of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of preceding text avail-
                 able for context output.

       -C number, --context=number
                 Output  number  lines  of  context both before and after each
                 matching line.  This is equivalent to setting both -A and  -B
                 to the same value.

       -c, --count
                 Do  not  output individual lines; instead just output a count
                 of the number of lines that would otherwise have been output.
                 If  several  files  are  given, a count is output for each of
                 them. In this mode, the -A, -B, and -C options are ignored.

       --colour, --color
                 If this option is given without any data, it is equivalent to
                 "--colour=auto".   If  data  is required, it must be given in
                 the same shell item, separated by an equals sign.

       --colour=value, --color=value
                 This option specifies under what circumstances the part of  a
                 line that matched a pattern should be coloured in the output.
                 The value may be "never" (the default), "always", or  "auto".
                 In  the  latter  case, colouring happens only if the standard
                 output is connected to a terminal. The colour can  be  speci-
                 fied  by  setting the environment variable PCREGREP_COLOUR or
                 PCREGREP_COLOR. The value of this variable should be a string
                 of  two  numbers,  separated by a semicolon.  They are copied
                 directly into the control string for setting colour on a ter-
                 minal,  so it is your responsibility to ensure that they make
                 sense. If neither of the environment variables  is  set,  the
                 default is "1;31", which gives red.

       -D action, --devices=action
                 If  an  input  path  is  not  a  regular file or a directory,
                 "action" specifies how it is to be  processed.  Valid  values
                 are  "read" (the default) or "skip" (silently skip the path).

       -d action, --directories=action
                 If an input path is a directory, "action" specifies how it is
                 to  be  processed.   Valid  values  are "read" (the default),
                 "recurse" (equivalent to the -r option), or "skip"  (silently
                 skip  the path). In the default case, directories are read as
                 if they were ordinary files. In some  operating  systems  the
                 effect  of reading a directory like this is an immediate end-
                 of-file.

       -e pattern, --regex=pattern,
                 --regexp=pattern Specify a pattern to be matched. This option
                 can  be  used multiple times in order to specify several pat-
                 terns. It can also be used as a way of  specifying  a  single
                 pattern  that starts with a hyphen. When -e is used, no argu-
                 ment pattern is taken from the command  line;  all  arguments
                 are treated as file names. There is an overall maximum of 100
                 patterns. They are applied to each line in the order in which
                 they  are  defined until one matches (or fails to match if -v
                 is used). If -f is used with -e, the  command  line  patterns
                 are  matched  first,  followed by the patterns from the file,
                 independent of the order in which these  options  are  speci-
                 fied.  Note that multiple use of -e is not the same as a sin-
                 gle pattern with alternatives. For  example,  X|Y  finds  the
                 first  character in a line that is X or Y, whereas if the two
                 patterns are given separately, pcregrep  finds  X  if  it  is
                 present, even if it follows Y in the line. It finds Y only if
                 there is no X in the line. This really matters  only  if  you
                 are using -o to show the portion of the line that matched.

       --exclude=pattern
                 When pcregrep is searching the files in a directory as a con-
                 sequence of the -r (recursive search) option, any files whose
                 names  match  the pattern are excluded. The pattern is a PCRE
                 regular expression. If a file name matches both --include and
                 --exclude,  it  is  excluded. There is no short form for this
                 option.

       -F, --fixed-strings
                 Interpret each pattern as a list of fixed strings,  separated
                 by  newlines,  instead  of  as  a  regular expression. The -w
                 (match as a word) and -x (match whole line)  options  can  be
                 used with -F. They apply to each of the fixed strings. A line
                 is selected if any of the fixed strings are found in it (sub-
                 ject to -w or -x, if present).

       -f filename, --file=filename
                 Read  a  number  of patterns from the file, one per line, and
                 match them against each line of input. A data line is  output
                 if any of the patterns match it. The filename can be given as
                 "-" to refer to the standard input. When -f is used, patterns
                 specified  on  the command line using -e may also be present;
                 they are tested before the file's patterns. However, no other
                 pattern  is  taken  from  the command line; all arguments are
                 treated as file names. There is an  overall  maximum  of  100
                 patterns. Trailing white space is removed from each line, and
                 blank lines are ignored. An empty file contains  no  patterns
                 and therefore matches nothing.

       -H, --with-filename
                 Force  the  inclusion  of the filename at the start of output
                 lines when searching a single file. By default, the  filename
                 is  not  shown in this case. For matching lines, the filename
                 is followed by a colon and a  space;  for  context  lines,  a
                 hyphen separator is used. If a line number is also being out-
                 put, it follows the file name without a space.

       -h, --no-filename
                 Suppress the output filenames when searching multiple  files.
                 By  default,  filenames  are  shown  when  multiple files are
                 searched. For matching lines, the filename is followed  by  a
                 colon  and  a space; for context lines, a hyphen separator is
                 used. If a line number is also being output, it  follows  the
                 file name without a space.

       --help    Output a brief help message and exit.

       -i, --ignore-case
                 Ignore upper/lower case distinctions during comparisons.

       --include=pattern
                 When pcregrep is searching the files in a directory as a con-
                 sequence of the -r  (recursive  search)  option,  only  those
                 files whose names match the pattern are included. The pattern
                 is a PCRE regular expression. If a  file  name  matches  both
                 --include  and  --exclude,  it is excluded. There is no short
                 form for this option.

       -L, --files-without-match
                 Instead of outputting lines from the files, just  output  the
                 names  of  the files that do not contain any lines that would
                 have been output. Each file name is output once, on  a  sepa-
                 rate line.

       -l, --files-with-matches
                 Instead  of  outputting lines from the files, just output the
                 names of the files containing lines that would have been out-
                 put.  Each  file  name  is  output  once, on a separate line.
                 Searching stops as soon as a matching  line  is  found  in  a
                 file.

       --label=name
                 This option supplies a name to be used for the standard input
                 when file names are being output. If not supplied, "(standard
                 input)" is used. There is no short form for this option.

       --locale=locale-name
                 This  option specifies a locale to be used for pattern match-
                 ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
                 ronment  variables.  If  no  locale  is  specified,  the PCRE
                 library's default (usually the "C" locale) is used. There  is
                 no short form for this option.

       -M, --multiline
                 Allow  patterns to match more than one line. When this option
                 is given, patterns may usefully contain literal newline char-
                 acters  and  internal  occurrences of ^ and $ characters. The
                 output for any one match may consist of more than  one  line.
                 When  this option is set, the PCRE library is called in "mul-
                 tiline" mode.  There is a limit to the number of  lines  that
                 can  be matched, imposed by the way that pcregrep buffers the
                 input file as it scans it. However, pcregrep ensures that  at
                 least 8K characters or the rest of the document (whichever is
                 the shorter) are available for forward  matching,  and  simi-
                 larly the previous 8K characters (or all the previous charac-
                 ters, if fewer than 8K) are guaranteed to  be  available  for
                 lookbehind assertions.

       -N newline-type, --newline=newline-type
                 The  PCRE  library  supports  five  different conventions for
                 indicating the ends of lines. They are  the  single-character
                 sequences  CR  (carriage  return) and LF (linefeed), the two-
                 character sequence CRLF, an "anycrlf" convention, which  rec-
                 ognizes  any  of the preceding three types, and an "any" con-
                 vention, in which any Unicode line ending sequence is assumed
                 to  end a line. The Unicode sequences are the three just men-
                 tioned,  plus  VT  (vertical  tab,  U+000B),  FF   (formfeed,
                 U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
                 U+2028), and PS (paragraph separator, U+2029).

                 When  the  PCRE  library  is  built,  a  default  line-ending
                 sequence   is  specified.   This  is  normally  the  standard
                 sequence for the operating system. Unless otherwise specified
                 by  this  option,  pcregrep  uses the library's default.  The
                 possible values for this option are CR, LF, CRLF, ANYCRLF, or
                 ANY.  This  makes  it  possible to use pcregrep on files that
                 have come from other environments without  having  to  modify
                 their  line  endings.  If the data that is being scanned does
                 not agree with the convention set by  this  option,  pcregrep
                 may behave in strange ways.

       -n, --line-number
                 Precede each output line by its line number in the file, fol-
                 lowed by a colon and a space for matching lines or  a  hyphen
                 and  a space for context lines. If the filename is also being
                 output, it precedes the line number.

       -o, --only-matching
                 Show only the part of the line that  matched  a  pattern.  In
                 this  mode,  no context is shown. That is, the -A, -B, and -C
                 options are ignored.

       -q, --quiet
                 Work quietly, that is, display nothing except error messages.
                 The  exit  status  indicates  whether or not any matches were
                 found.

       -r, --recursive
                 If any given path is a directory, recursively scan the  files
                 it  contains, taking note of any --include and --exclude set-
                 tings. By default, a directory is read as a normal  file;  in
                 some  operating  systems this gives an immediate end-of-file.
                 This option is a shorthand  for  setting  the  -d  option  to
                 "recurse".

       -s, --no-messages
                 Suppress  error  messages  about  non-existent  or unreadable
                 files. Such files are quietly skipped.  However,  the  return
                 code is still 2, even if matches were found in other files.

       -u, --utf-8
                 Operate  in UTF-8 mode. This option is available only if PCRE
                 has been compiled with UTF-8 support. Both patterns and  sub-
                 ject lines must be valid strings of UTF-8 characters.

       -V, --version
                 Write  the  version  numbers of pcregrep and the PCRE library
                 that is being used to the standard error stream.

       -v, --invert-match
                 Invert the sense of the match, so that  lines  which  do  not
                 match any of the patterns are the ones that are found.

       -w, --word-regex, --word-regexp
                 Force the patterns to match only whole words. This is equiva-
                 lent to having \b at the start and end of the pattern.

       -x, --line-regex, --line-regexp
                 Force the patterns to be anchored (each must  start  matching
                 at  the beginning of a line) and in addition, require them to
                 match entire lines. This is equivalent  to  having  ^  and  $
                 characters at the start and end of each alternative branch in
                 every pattern.


ENVIRONMENT VARIABLES


       The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that
       order,  for  a  locale.  The first one that is set is used. This can be
       overridden by the --locale option.  If  no  locale  is  set,  the  PCRE
       library's default (usually the "C" locale) is used.


NEWLINES


       The  -N (--newline) option allows pcregrep to scan files with different
       newline conventions from the default.  However,  the  setting  of  this
       option  does not affect the way in which pcregrep writes information to
       the standard error and output streams. It uses the  string  "\n"  in  C
       printf()  calls  to  indicate newlines, relying on the C I/O library to
       convert this to an appropriate sequence if the  output  is  sent  to  a
       file.


OPTIONS COMPATIBILITY


       The majority of short and long forms of pcregrep's options are the same
       as in the GNU grep program. Any long option of  the  form  --xxx-regexp
       (GNU  terminology) is also available as --xxx-regex (PCRE terminology).
       However, the --locale, -M, --multiline, -u,  and  --utf-8  options  are
       specific to pcregrep.


OPTIONS WITH DATA


       There are four different ways in which an option with data can be spec-
       ified.  If a short form option is used, the  data  may  follow  immedi-
       ately, or in the next command line item. For example:

         -f/some/file
         -f /some/file

       If  a long form option is used, the data may appear in the same command
       line item, separated by an equals character, or (with one exception) it
       may appear in the next command line item. For example:

         --file=/some/file
         --file /some/file

       Note,  however, that if you want to supply a file name beginning with ~
       as data in a shell command, and have the  shell  expand  ~  to  a  home
       directory, you must separate the file name from the option, because the
       shell does not treat ~ specially unless it is at the start of an  item.

       The  exception  to  the  above is the --colour (or --color) option, for
       which the data is optional. If this option does have data, it  must  be
       given  in  the first form, using an equals character. Otherwise it will
       be assumed that it has no data.


MATCHING ERRORS


       It is possible to supply a regular expression that takes  a  very  long
       time  to  fail  to  match certain lines. Such patterns normally involve
       nested indefinite repeats, for example: (a+)*\d when matched against  a
       line  of  a's  with  no  final  digit. The PCRE matching function has a
       resource limit that causes it to abort in these circumstances. If  this
       happens, pcregrep outputs an error message and the line that caused the
       problem to the standard error stream. If there are more  than  20  such
       errors, pcregrep gives up.


DIAGNOSTICS


       Exit status is 0 if any matches were found, 1 if no matches were found,
       and 2 for syntax errors and non-existent or inacessible files (even  if
       matches  were  found in other files) or too many matching errors. Using
       the -s option to suppress error messages about inaccessble  files  does
       not affect the return code.


SEE ALSO


       pcrepattern(3), pcretest(1).


AUTHOR


       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.


REVISION


       Last updated: 16 April 2007
       Copyright (c) 1997-2007 University of Cambridge.

                                                                   PCREGREP(1)
See also pgrep(1)

Man(1) output converted with man2html