AWK

 

The following is a summary of the most common awk statements and features.

 

Command Line

	awk  program  filenames
	awk  -f  program-file  filenames
	awk -Fs
	(sets field separator to string s, -Ft sets separator to tab)
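
A minimal sketch of the -F option, assuming a colon-separated input that is made up for illustration:

```sh
# Print the first field of each colon-separated record.
# The sample records are invented for this example.
first_fields=$(printf 'root:x:0\ndaemon:x:1\n' | awk -F: '{ print $1 }')
echo "$first_fields"
```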

Patterns

	BEGIN
	END
	/regular expression/
	relational expression
	pattern && pattern
	pattern || pattern
	(pattern)
	!pattern
	pattern, pattern
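
The pattern forms above can be combined in one program. The sketch below assumes a small two-column input invented for illustration; it shows BEGIN and END, a range pattern, and a compound pattern:

```sh
# BEGIN and END run before and after the input is read.
# A range pattern selects every line from the first match
# through the second match, inclusive.
pattern_out=$(printf 'a 1\nb 2\nc 3\nd 4\ne 5\n' | awk '
    BEGIN            { print "start" }
    /^b/, /^d/       { print "range:", $0 }
    $2 > 4 && !/^z/  { print "big:", $0 }
    END              { print "end" }
')
echo "$pattern_out"
```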


Control Flow Statements

	if ( expr)  statement [ else statement]
	if ( subscript  in  array)  statement [ else  statement]
	while ( expr)  statement
	for ( expr ;  expr ;  expr )  statement
	for (  var  in  array )  statement
	do statement  while ( expr)
	break
	continue
	next
	exit [ expr]
	return [ expr]
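
As a rough illustration of next, the in test, and for (var in array), the sketch below counts occurrences of the first field while skipping lines that start with #. The input is made up:

```sh
count_out=$(printf 'x\ny\nx\n#skip\nx\n' | awk '
    /^#/ { next }                 # skip comment lines, go to next record
    { count[$1]++ }               # tally each first field
    END {
        if ("x" in count) print "x", count["x"]
        n = 0
        for (k in count) n++      # iteration order is unspecified
        print "keys", n
    }
')
echo "$count_out"
```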


Input Output

	close(  filename )		close file
	getline				set $0 from next input line, set NF, NR, FNR
	getline <  file			set $0 from next input line of file, set NF
	getline var			set var from next input line, set NR, FNR
	getline var <  file		set var from next input line of file
	print				print current input line
	print expr-list			print expressions
	print expr-list >  file		print expressions to file
	printf fmt, expr-list		format and print
	printf fmt, expr-list  >  file	format and print to file
	system( cmd-line )		execute command cmd-line, return status

In print and printf above, >> appends to a file, and | command writes to 
a pipe. Similarly, command | getline pipes into getline. The function getline returns 
1 on success, 0 at the end of a file, and -1 on an error.
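
Because getline can return -1, a read loop is usually written with an explicit > 0 test; a bare while (getline line < file) would loop forever on an unreadable file. A minimal sketch, assuming a temporary file created with mktemp and the variable name f chosen for illustration:

```sh
tmp=$(mktemp)
printf 'one\ntwo\n' > "$tmp"

getline_out=$(awk -v f="$tmp" 'BEGIN {
    # Read the file line by line; stop on end of file (0) or error (-1).
    while ((getline line < f) > 0) n++
    close(f)

    # Read one line from a command through a pipe.
    "echo hello" | getline greeting
    close("echo hello")

    print n, greeting
}')
rm -f "$tmp"
echo "$getline_out"
```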


Functions

	func  name(  parameter list ) {  statement }
	function  name (  parameter list ) {  statement }
	function-name ( expr, expr, ... )
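
A short sketch of a user-defined function; max is a hypothetical helper written for this example, not a built-in:

```sh
func_out=$(awk '
    # max is a hypothetical helper, not part of awk itself.
    function max(a, b) { return a > b ? a : b }
    BEGIN { print max(3, 7), max(10, 2) }
')
echo "$func_out"
```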


String Functions

	gsub(r,s,t)	substitutes s for r in t globally, returns number of  substitutions
	index(s,t)	returns position of string t in s, 0 if not present
	length(s)	returns length of s
	match(s,r)	returns position in s where r occurs, 0 if not present
	split(s,a,r)	splits s into array a on r, returns number of fields
	sprintf(fmt, expr-list)	returns expr-list formatted according to format string specified by fmt
	sub(r,s,t)	substitutes s for first r in t, returns number of substitutions
	substr(s,p,n)	returns substring of s length n starting at position p
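
The string functions above can be tried out in a BEGIN block. The sample strings below are invented for illustration:

```sh
string_out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)              # replace every o in s, n is the count
    parts = split("a:b:c", arr, ":")   # arr[1]="a", arr[2]="b", arr[3]="c"
    print n, s
    print parts, arr[2]
    print index("banana", "nan"), length("banana"), substr("banana", 2, 3)
    if (match("foobar", /o+/)) print RSTART, RLENGTH
}')
echo "$string_out"
```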


Arithmetic Functions

	atan2(y,x)	arctangent of y/x in radians
	cos(x)		cosine of x, with x in radians
	exp(x)		exponential function of x
	int(x)		integer part of x truncated towards 0
	log(x)		natural logarithm of x
	rand()		random number r, where 0 <= r < 1
	sin(x)		sine of x, with x in radians
	sqrt(x)		square root of x
	srand(x)	x is new seed for rand()
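
A brief sketch of the arithmetic functions. The seed value 1 is arbitrary, and the actual value returned by rand() depends on the awk implementation, so only its range is checked:

```sh
math_out=$(awk 'BEGIN {
    print int(3.7), int(-3.7)        # truncation is toward 0
    print sqrt(16), exp(0), log(1)
    printf "%.4f\n", atan2(0, -1)    # pi, to four decimal places
    srand(1); r = rand()
    print (r >= 0 && r < 1)          # 1 if r is in the expected range
}')
echo "$math_out"
```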
	

Operators (increasing precedence)

	=   +=   -=   *=   /=   %=   ^=		assignment
	?:					conditional expression
	||					logical or
	&&					logical and
	~   !~					regular expression match, negated match
	<   <=   >   >=   !=   ==		relationals
	blank					string concatenation
	+   - 					add, subtract
	*   /   %				multiply, divide, modulus
	+   -   !				unary plus, unary minus, logical negation
	^					exponentiation
	++   --					increment, decrement
	$					field
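
A few lines that illustrate how the precedence table plays out in practice. The low precedence of string concatenation relative to + is a common surprise:

```sh
prec_out=$(awk 'BEGIN {
    print 1 + 2 * 3          # * binds tighter than +
    print 2 * 3 ^ 2          # ^ binds tighter than *
    print 1 " " 2 + 3        # + binds tighter than concatenation
    x = "a" "b" == "ab"      # concatenation binds tighter than ==
    print x
}')
echo "$prec_out"
```

When in doubt, parentheses make the intended grouping explicit.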


Regular Expressions (increasing precedence)

	c			matches nonmetacharacter c
	\c			matches literal character c
	.			matches any character except newline
	^			matches beginning of line or string
	$			matches end of line or string
	[abc...]		character class matches any of abc...
	[^abc...]		negated class matches any but abc... and newline
	r1 | r2			matches either r1 or r2
	r1r2			concatenation: matches r1, then r2
	r+			matches one or more r's
	r*			matches zero or more r's
	r?			matches zero or one r
	(r)			grouping: matches r
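
A small sketch of the ? operator, a character class, and an escaped metacharacter, run against made-up input lines:

```sh
regex_out=$(printf 'color\ncolour\ncolouur\ncat\ncot\na.b\naxb\n' | awk '
    /^colou?r$/   { print "optional-u:", $0 }    # zero or one u
    /^c[ao]t$/    { print "class:", $0 }         # a or o in the middle
    /^a\.b$/      { print "literal-dot:", $0 }   # \. matches only a dot
')
echo "$regex_out"
```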


Built-In Variables

	ARGC		number of command-line arguments
	ARGV		array of command-line arguments (0..ARGC-1)
	FILENAME	name of current input file
	FNR		input line number in current file
	FS		input field separator (default blank)
	NF		number of fields in input line
	NR		number of input lines read so far
	OFMT		output format for numbers (default=%.6g)
	OFS		output field separator (default=space)
	ORS		output line separator (default=newline)
	RS		input line separator (default=newline)
	RSTART		index of first character matched by match()
	RLENGTH		length of string matched by match()
	SUBSEP		subscript separator (default="\034")
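
A minimal sketch of NR, NF, and OFS at work, using a two-line input invented for illustration:

```sh
vars_out=$(printf 'a b c\nd e\n' | awk '
    BEGIN { OFS = "-" }            # separator used between print arguments
    { print NR, NF, $NF }          # line number, field count, last field
    END { print "total", NR }      # NR holds the total number of lines read
')
echo "$vars_out"
```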


Limits

Each implementation of awk imposes some limits. Typical values are:

	100 fields
	2500 characters per input line
	2500 characters per output line
	1024 characters per individual field
	1024 characters per printf string
	400 characters maximum quoted string
	400 characters in character class
	15 open files
	1 pipe