For a more thorough and precise description check out the Perl website or one of the numerous Perl reference books.
You can either run commands interactively from the perl interpreter (/usr/bin/perl) or
put your perl commands in a file that begins with
#!/usr/bin/perl
(that's version 5.8 at the time of this writing) or (for version 5.004)
#!/usr/local/bin/perl
You can also determine which version you're working with using perl -v
Lots of information, and the latest downloads, are available through perl.com.
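General language features:
Statements are terminated with ; except for compound statements ending with a }
Comments begin with # and run to the end of the line.
Multi-line comments can be started by placing a POD directive such as =pod at the start of a line; all lines are then ignored up to and including one which begins with =cut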
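As an example, if the localtime function is called in a scalar context it returns a single scalar value, a formatted date string such as "Thu Sep  9 14:30:00 2004", whereas in a list context it returns a list of nine date and time fields. (The number of seconds since January 1, 1970 is returned by the time function instead.)
$datestring = localtime();   # scalar context: one formatted string
@datearray = localtime();    # list context: nine fields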
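The effect of context can be checked with a short sketch. The variable names are illustrative only, and time is included just for comparison, since it (rather than localtime) is the function that returns seconds since January 1, 1970:

```perl
use strict;
use warnings;

# Scalar context: localtime returns one formatted date string.
my $datestring = localtime();

# List context: localtime returns nine separate fields
# (sec, min, hour, mday, mon, year, wday, yday, isdst).
my @datearray = localtime();
my ($sec, $min, $hour, $mday, $mon, $year) = @datearray;

# time() returns the number of seconds since the epoch.
my $numsecs = time();

print "scalar context: $datestring\n";
print "list context has ", scalar(@datearray), " fields\n";
printf "year %d, month %d\n", $year + 1900, $mon + 1;
```

Note that the month field is 0-based and the year field counts from 1900, which is why the last line adjusts both.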
Identifiers for scalar variables begin with a $ symbol.
Strings:
In a double-quoted string such as
"variable foo has value \n \t $foo"
the \n and \t are interpreted as newline and tab characters, and the value of variable foo is substituted for the $foo token.
In a single-quoted string nothing is interpreted or substituted except \' and \\, which are used to include ' and \ in the string.
('a','blah','foo')
(2,4,5,9)
etc.
You can create a list from a text block using the qw// operator:
qw/a b c/ # creates list ('a','b','c')
The values of an array can be set from a list, e.g.
@myarray = (1, 7, 19, 3);
@otherarray = ("foo","blah","etc");
Accessing individual array elements is carried out through the [] subscript delimiters, e.g.
print "$myarray[0]";
(Note that the array element is a scalar, so we're using $myarray[0] not @myarray[0].)
The array elements are subscripted starting from 0, and if a negative subscript is used it counts backwards from the end of the array.
print "$myarray[-1]"; # prints the last element of the array
If an element is assigned beyond the current array bounds the array is dynamically extended to hold it, while reading beyond the bounds simply yields undef (hence no range checking).
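A minimal sketch of subscripting and dynamic growth, reusing the @myarray values from above (the other variable names are illustrative):

```perl
use strict;
use warnings;

my @myarray = (1, 7, 19, 3);

my $first = $myarray[0];     # first element
my $last  = $myarray[-1];    # last element, counting back from the end

# Assigning past the end extends the array; the gap is filled with undef.
$myarray[6] = 42;
my $length = scalar(@myarray);

# Reading past the end returns undef and does not extend the array.
my $missing = $myarray[100];
my $still   = scalar(@myarray);

print "first=$first last=$last length=$length still=$still\n";
```

Evaluating an array in scalar context, as scalar(@myarray) does here, gives its current number of elements.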
Identifiers for hash variables begin with a % symbol.
Hash variables consist of a set of key/value pairs; the key is used to access elements rather than a subscript.
Hash values can be set using a list of pairs of values, e.g.:
%colorvalues = ( 'red', 1, 'green', 2, 'blue', 3 );
print "$colorvalues{'red'}"; # prints value 1
An alternative notation that has the same effect but might be clearer from a readability viewpoint is using the => operator:
%colorvalues = ( 'red' => 1, 'green' => 2, 'blue' => 3 );
print "$colorvalues{'red'}"; # prints value 1
As with arrays, note that accessing a hash element uses the $ symbol, not the %, since we're accessing the scalar content within the hash.
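As a small sketch of everyday hash handling, the following reuses %colorvalues and adds the built-in keys, exists, and delete functions, which are not covered in the text above (the 'yellow' entry is made up for illustration):

```perl
use strict;
use warnings;

my %colorvalues = ( 'red' => 1, 'green' => 2, 'blue' => 3 );

# Add a new key/value pair by assigning to a single element.
$colorvalues{'yellow'} = 4;

# keys returns the keys in no particular order, so sort them for display.
my @names = sort keys %colorvalues;

# exists tests for a key without creating it.
my $has_red = exists $colorvalues{'red'} ? 1 : 0;

# delete removes a key/value pair.
delete $colorvalues{'green'};
my $count = scalar(keys %colorvalues);

print "names: @names\n";
print "has red: $has_red, pairs left: $count\n";
```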
A reference is a scalar variable (hence prefixed with $), and to create a reference use the \ operator before the target location.
E.g. $myptr = \$x; sets up myptr as a reference to variable x.
To dereference a variable such as myptr, precede it with the prefix for the desired data type, e.g. use $$myptr, @$myptr, or %$myptr as appropriate depending on whether myptr references a scalar, an array, or a hash.
The ref function can be used to determine which kind of data is being referenced; it returns one of the following:
SCALAR ARRAY HASH CODE GLOB REF
or a false value (the empty string) if you apply ref to something which is not a reference.
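A short sketch of creating references, dereferencing them, and asking ref what they point at (all variable names here are illustrative):

```perl
use strict;
use warnings;

my $x     = 10;
my @list  = (1, 2, 3);
my %table = ( a => 1 );

my $sref = \$x;                      # reference to a scalar
my $aref = \@list;                   # reference to an array
my $href = \%table;                  # reference to a hash
my $cref = sub { return 'hi' };      # reference to an anonymous subroutine

$$sref = 11;                         # changes $x through the reference
push @$aref, 4;                      # changes @list through the reference
$$href{b} = 2;                       # changes %table through the reference

my $kind_scalar = ref($sref);        # SCALAR
my $kind_array  = ref($aref);        # ARRAY
my $kind_hash   = ref($href);        # HASH
my $kind_code   = ref($cref);        # CODE
my $kind_ref    = ref(\$sref);       # REF (a reference to a reference)
my $kind_plain  = ref('plain');      # empty string: not a reference

print "$kind_scalar $kind_array $kind_hash $kind_code $kind_ref\n";
```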
You may explicitly declare a variable, with either dynamic scope (declaring a local variable) or lexical scope (declaring a my variable).
Lexical scope means the variable is truly local to the block within which it is declared.
Dynamic scope means the variable is also visible to any function called from the block in which it is declared (and then to any function called by that function, etc).
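The difference between the two kinds of scope shows up when a called function looks at the variable. In this sketch the subroutine and variable names are made up for illustration, and the global is declared with our so that the script also runs under use strict:

```perl
use strict;
use warnings;

our $level = 'global';

# Reads the package variable $level, whatever its current value is.
sub show_level { return $level; }

# local gives the global a temporary value that called functions also see.
sub with_local {
    local $level = 'local';
    return show_level();
}

# my creates a new variable visible only inside this block,
# so show_level still sees the global.
sub with_my {
    my $level = 'my';
    return show_level();
}

my $seen_local = with_local();
my $seen_my    = with_my();
my $afterwards = show_level();

print "local: $seen_local, my: $seen_my, afterwards: $afterwards\n";
```

Once with_local returns, the global has its original value again, which is why the final call reports the unchanged value.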
$_ or $ARG: the default string, the assumed source or destination of a string in many places if you fail to otherwise specify it (the long names in this list, such as $ARG, are only available after use English;)
$PID: the current process id
$UID: the user id for this process
$BASETIME: the time the program began running (seconds since Jan 1/70)
$0 or $PROGRAM_NAME: the name of this program
$DEBUGGING: the current debugging flags
$PERL_VERSION or $]: the current version of Perl
@ARGV: the array of command-line arguments, $ARGV[i] is the ith command line argument, 0-based
%ENV: the current environment settings
$MATCH: the string matched by the last successful pattern match
$INPUT_LINE_NUMBER or $. or $NR: the current input line number on the last input file read
$/ or $RS or $INPUT_RECORD_SEPARATOR: the input record separator (newline by default, if set to the null string it uses blank lines as separators)
Predefined file handles: ARGV, STDERR, STDIN, and STDOUT
if (expression) { ... } elsif (expression) { ... } elsif (expression) { ... } else { ... }
unless (expression) { ... } else { ... }
goto command:
JUMPHERE: ...
...
goto JUMPHERE;
Exception: the goto cannot be used to jump inside a structure that requires some form of initialization, such as a for loop or a subroutine.
while (expression) { ... }
do { ... } while (expression);
until (expression) { ... }
do { ... } until (expression);
for ($index = 0; $index < 10; $index++) { ... }
# print out a list of colors
foreach $current ("blue", "red", "green") { print "$current "; }
You can exit a loop early using the last statement.
You can skip to the next iteration of a loop using the next statement.
You can restart execution of the loop block without evaluating the conditional again by using the redo statement.
If the loop structure is labeled, you can supply the label as an argument to the next, last, or redo statements - allowing you to jump across more than one level of loop.
MYLOOP: while (expression) {
    ...
    while (expression2) {
        ...
        if (expression3) { next MYLOOP; }
        ...
    }
    ...
}
A continue block attached to a loop is executed at the end of each iteration, including after a next statement in the main loop body, but is not executed following a last statement in the main loop body.
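The interaction of next, last, and a continue block can be traced with a small counter loop. This is only a sketch, and the variable names are illustrative:

```perl
use strict;
use warnings;

my $i = 0;
my @visited;
my $continue_runs = 0;

while ($i < 10) {
    next if $i % 2;      # odd values: skip the rest, continue block still runs
    last if $i == 6;     # leave the loop, continue block is NOT run
    push @visited, $i;
} continue {
    $i++;                # the increment lives here so next cannot bypass it
    $continue_runs++;
}

print "visited: @visited\n";
print "continue block ran $continue_runs times, loop ended with i = $i\n";
```

Placing the increment in the continue block is what keeps next from turning the loop into an infinite one.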
, =>: comma/arrow-comma
\: reference operator
**: exponentiation
<=>: signed comparison
and or not xor: logic operators
~: bitwise not
.. ...: range
.: string concatenation
lt, gt, le, ge, eq, ne, cmp: string comparison operators
=~: pattern match
!~: pattern non-match
associativity   operators (high at top)
left            ->
non             ++ --
right           **
right           ! ~ \ unary+ unary-
left            =~ !~
left            * / %
left            + - .
left            << >>
non             < > <= >= lt gt le ge
non             == != <=> eq ne cmp
left            &
left            | ^
left            &&
left            ||
non             .. ...
right           ?:
right           = += -= *= **= .= /= %= &= |= ^= <<= >>= &&= ||=
left            , =>
right           not
left            and
left            or xor
Subroutines are called with their parameters in brackets, e.g. foo($x, $y, $z)
They can also be called omitting the brackets if the subroutine has already been declared, e.g. foo $x, $y, $z
In the callee, the parameters are accessible through the array named @_
You can also call subroutines using the ampersand, e.g. &foo; which (when no argument list is given) has the effect of passing the caller's @_ along to the callee.
When passing arrays or hashes as parameters you'll usually want to pass references to the item, which is done by using the \ operator before the array/hash name, e.g. foo(\@myarray, \@anotherarray);
When declaring subroutines, the format is
sub routinename { ... }
As an option, when declaring the subroutine you can indicate the types of parameters it expects to receive using the symbols $ @ % & * for scalar, list, hash, subroutine, and typeglob respectively.
For example, if you expect a scalar, an array, and another scalar, then the prototype might look like:
sub routinename ($\@$) { ... }
(The array is written as \@ so that it is passed by reference; a plain @ would swallow all of the remaining arguments, leaving nothing for the final scalar.)
To create local variables that are lexically scoped (not visible externally) precede their declaration with my, and for dynamically scoped variables precede their declaration with local.
The return statement allows the return of scalars or lists to the calling routine.
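Putting these pieces together, here is a minimal sketch of a subroutine that unpacks @_, receives an array by reference, and returns a list. The subroutine name sum_and_count and its parameters are hypothetical:

```perl
use strict;
use warnings;

# Takes a label and a reference to an array of numbers;
# returns the label, the total, and the number of elements.
sub sum_and_count {
    my ($label, $numbers_ref) = @_;    # copy the parameters out of @_
    my $total = 0;
    $total += $_ foreach @$numbers_ref;
    return ($label, $total, scalar(@$numbers_ref));
}

my @myarray = (1, 7, 19, 3);

# Pass the array by reference so it arrives as one parameter.
my ($label, $total, $count) = sum_and_count('demo', \@myarray);

print "$label: total $total over $count values\n";
```

Had the array been passed directly rather than by reference, its elements would have been flattened into @_ alongside the label.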
($mystring =~ /blah/) is true if the mystring variable contains "blah"
$mystring =~ tr/pattern1/pattern2/ goes through mystring and replaces all the characters from pattern1 with the corresponding characters from pattern2
$mystring =~ s/oldpattern/newpattern/ goes through mystring and replaces the first occurrence of the old pattern (if found) with the new pattern; add the g modifier, as in s/oldpattern/newpattern/g, to replace every occurrence
More complex RE's can be built up based on simpler ones; some of the operators or metacharacters available are:
| for the "or" of two expressions
( ) for grouping expressions, e.g. (blah|foo) would match either string "blah" or string "foo"
. is a wildcard matching any single character (except a newline, by default)
\ for including special characters, e.g.:
\a alarm (bell)
\f formfeed
\n newline
\e escape
\r carriage return
\x7f (any hex value for 7f) ascii value
\t tab
\cx control-x (any char for x)
^ requires a match at the beginning of the string
$ requires a match at the end of the string
* repeat the previous element 0 or more times
+ repeat the previous element 1 or more times
? repeat the previous element 0 or 1 times
[ ... ] match any element enclosed, can use dashes to indicate a range, e.g. [a-zA-Z0-9] for alphanumeric
Special note: if ^ is placed first within the [ ]'s it negates the character class following, e.g. [^0-9] means anything except a digit
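The matching, translation, and substitution operators and a few of the metacharacters can be tried out in a short sketch. The sample string and variable names are made up for illustration, and each change is made on a copy so the original string stays intact:

```perl
use strict;
use warnings;

my $mystring = "blah blah 123";

# Pattern match: true if the string contains "blah".
my $matched = ($mystring =~ /blah/) ? 1 : 0;

# tr maps each character in the first set to the corresponding one in the second.
(my $upper = $mystring) =~ tr/a-z/A-Z/;

# s/// replaces only the first match unless the g modifier is given.
(my $first_only = $mystring) =~ s/blah/foo/;
(my $all        = $mystring) =~ s/blah/foo/g;

# Anchors, grouping with alternation, and character classes.
my $starts_ok = ($mystring =~ /^(blah|foo)/) ? 1 : 0;
my ($digits)  = $mystring =~ /([0-9]+)$/;      # capture the trailing digits
my $nondigit  = ($mystring =~ /^[^0-9]/) ? 1 : 0;

print "$upper\n$first_only\n$all\n$digits\n";
```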
To open a file for input, you allocate a file handle with the open function. To access one line of data at a time until end-of-file, use the < > symbols around the file handle and the $_ variable, e.g.:
open(INFILE, "filename.txt");
while (<INFILE>) { print "$_ \n"; }
If we wanted to combine that with some of the regular expression tests, we could use something like this (prints all lines containing the text "foo")
open(INFILE, "filename.txt");
while (<INFILE>) { print "$_ \n" if /foo/; }
To open a file for output, prefix the file name with a > symbol, e.g.:
open(OUTFILE, ">filename.txt");
print OUTFILE "blah blah blah";
To append to an existing file, use two >> symbols, e.g.:
open(OUTFILE, ">>filename.txt");
print OUTFILE "blah blah blah";
The open function returns true if the open was successful, false otherwise.
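A complete write, append, and read cycle might look like the sketch below. It departs from the examples above in two ways that are assumptions of this sketch rather than part of the original text: it uses the three-argument form of open with lexical file handles, and it works on a temporary file created with the core File::Temp module so that it can be run anywhere:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($tmp, $filename) = tempfile(UNLINK => 1);
close($tmp);

# Open for output (truncates any existing content).
open(my $out, '>', $filename) or die "could not open $filename: $!";
print $out "foo line\n";
print $out "bar line\n";
close($out);

# Open for appending.
open($out, '>>', $filename) or die "could not open $filename: $!";
print $out "another foo\n";
close($out);

# Open for input and keep only the lines containing "foo".
open(my $in, '<', $filename) or die "could not open $filename: $!";
my @foo_lines;
while (<$in>) {
    chomp;                          # strip the trailing newline from $_
    push @foo_lines, $_ if /foo/;
}
close($in);

print "$_\n" foreach @foo_lines;
```

Checking the result of every open, as done here with or die, avoids silently reading from or writing to a handle that was never opened.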
Formatting of print output can be done using format templates and field holders:
format myreport =
The account holder is @<<<<<<<<<< and my account balance is @####.##
$name, $cash
.
(The template is ended by a line containing only a period.) The < symbols are place holders for a left-justified field, and the # symbols are for a fixed-precision numeric field.
For right-justified use > symbols, and for centered fields use | symbols.
die is used as a mechanism to test for errors and terminate the program with both an error message and a returned error status.
die "blah blah blah"; terminates the program after printing "blah blah blah" to STDERR. The status value returned is whatever value is currently in variable $!, usually the result of a preceding failed command.
For instance, combined with the file opening code from above:
open(OUTFILE, ">>filename.txt") or die "could not open filename.txt";
print OUTFILE "blah blah blah";
warn can similarly be used to generate warning messages without terminating the program, e.g.
open(OUTFILE, ">>filename.txt") or warn "could not open filename.txt";
CGI scripts with Perl
As with Python, the script output in Perl is typically generated with print statements, e.g.:
print "Content-type: text/html\n\n",
      "<html><body>\n",
      "Hi!\n",
      "</body></html>\n";
We can determine what kind of method was used to submit the data (e.g. GET or POST), and then read the query string into a variable for later parsing:
# find out the method used (it is stored as an environment variable)
$method = $ENV{'REQUEST_METHOD'};

# get the query string from an environment variable if
# the GET method was used
if ($method eq "GET") {
    $querystring = $ENV{'QUERY_STRING'};
}
# read the query string from the request body if
# the POST method was used
elsif ($method eq "POST") {
    read(STDIN, $querystring, $ENV{'CONTENT_LENGTH'});
}
# if any other method was used then this
# script is the wrong place to be!
else {
    print "ERROR - illegal method used to call script";
    exit(1);
}
Having extracted the query string, we can split it up into a collection of name/value pairs, then process them one at a time:
@pairs = split(/&/, $querystring);
foreach $pair (@pairs) {
    ($name, $value) = split(/=/, $pair);
    # now do something with the name and value
}
Again, this is only the tip of the iceberg, but it provides a starting point for capturing and using form data.
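One thing the splitting loop leaves out is that form data arrives URL-encoded: spaces are sent as + and other special characters as %XX hex codes. The sketch below decodes each name and value and collects them in a hash; the sample query string and the %form hash are made up for illustration, and in practice the CGI module's param function does this work for you:

```perl
use strict;
use warnings;

# A made-up query string standing in for $ENV{'QUERY_STRING'}.
my $querystring = 'name=Jane+Doe&city=St.%20John%27s';

my %form;
foreach my $pair (split /&/, $querystring) {
    # Limit the split to 2 fields so a value may itself contain "=".
    my ($name, $value) = split /=/, $pair, 2;
    $value = '' unless defined $value;

    # Decode both parts in place: + becomes a space, %XX becomes the character.
    foreach ($name, $value) {
        tr/+/ /;
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    }
    $form{$name} = $value;
}

print "$_ = $form{$_}\n" foreach sort keys %form;
```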
Using cookies from Perl
Here is a short example illustrating the use of cookies in Perl with the CGI module; I'll fill in more details as time permits.
#! /usr/bin/perl
use CGI qw(:standard);
use CGI::Cookie;

# we can look up the existing cookies, e.g.:
# $name = cookie("username");
# if ($name) {
#    ... the cookie has been set previously ...
# } else {
#    ... no cookie named "username" has been set yet ...
# }

# here we set up a form to grab a new username,
# then we submit the form (to this same script)
# for processing
unless( param() ) {
    # this is the first time we've hit the form,
    # so generate the form to get the cookie values
    print header(),
          start_html("User login form"),
          h1("User login form"),
          start_form(),
          p("Enter your username", textfield("NAME")),
          submit(),
          end_form(),
          end_html();
} else {
    # the form has been submitted,
    # so process it and generate the cookie

    # first lookup the submitted username
    $name = param("NAME");

    # next set up the new cookie values
    $values = CGI::Cookie->new(
        -name    => "username",         # the cookie name
        -value   => $name,              # the cookie value (the user's name)
        -expires => "+30s",             # it expires 30 seconds from now
        -path    => "/home/someplace",  # the path associated with the cookie
    );

    # echo the information to the user
    # (the cookie is sent as part of the HTTP header)
    print header( -cookie => $values ),
          start_html("ThanksScreen"),
          h3("Thanks for submitting the User login data"),
          p("Your username is ", b($name), " and is valid for the next 30 seconds"),
          end_html();
}
Method 1: (Hack)
If a server doesn't have the Perl DBI module installed then the typical Perl+MySQL options aren't available, and one has to work through operating system calls to the mysql client, as shown below.
In this case, we will create a command string representing the call to the mysql client, and follow the standard Perl procedures for opening a pipe to direct the execution results to an appropriate handle in our Perl script. We can then process through the results of the call one line at a time.
#! /usr/bin/perl
#
# set up the connection and query information
$query = "\"SHOW DATABASES;\"";
$user = "whoever";
$pwd = "somethingclever";
$executable = "/usr/local/mysql/bin/mysql";

# build the mysql client command
$cmd = "$executable -u$user -p$pwd -e$query";

# run the command and pipe its output to our handle,
# quit with an error message if we couldn't connect
if (!open(CMDPIPE, "$cmd |")) {
    die "Could not connect to the server\n";
}

# run through the command output, one line at a time
while (<CMDPIPE>) {
    print "$_";
}

# close the command's pipe
close(CMDPIPE);
Method 2: (Preferred)
If a server has the Perl DBI module installed then this can significantly improve access to a MySQL database.
Below is a short example of connecting to a MySQL database and running a query.
#! /usr/bin/perl5.8.4
#
use CGI;
use DBI;

# declare the variables to hold the necessary
# user and host information to establish a connection
# here we've hard-coded them in the script itself,
# which is actually undesirable from a security standpoint
my $host = 'localhost';
my $db = 'some_database_name';
my $user = 'whoever';
my $pwd = 'somethingclever';
my $sock = '/tmp/mysql.sock';

# establish the connection
my $dbhandle = DBI->connect("dbi:mysql:dbname=$db;host=$host;mysql_socket=$sock", "$user", "$pwd");

# prepare a "SHOW TABLES;" query
my $query = $dbhandle->prepare("SHOW TABLES;");

# run the query
$query->execute();

# grab rows of results from the query output
# until you run out of them
# (note here we just print the first element of
# each nextrow array, since we know the "show tables"
# query just produces one column per row)
while (my @nextrow = $query->fetchrow_array()) {
    printf("%s\n", $nextrow[0]);
}

# finish the query
$query->finish();

# disconnect from the server/database
$dbhandle->disconnect();
If we want to do updates, inserts, or replaces from a Perl script (rather than a simple query) then we use the do command, e.g.
# do takes as parameters the SQL command,
# the processing attributes (undef)
# and the series of data values to be inserted
# it returns the number of rows affected
my $numrows = $dbhandle->do("INSERT INTO tablename (colA, colB, colC) VALUES (?, ?, ?)",
                            undef, $valueA, $valueB, $valueC);
If we want to do transaction handling from a Perl script, we need to remember to turn autocommit off, and either commit or rollback after the do statement, e.g.
# turn autocommit off
$dbhandle->{AutoCommit} = 0;

# do the transaction
.....

# commit or rollback
$dbhandle->commit();    # or $dbhandle->rollback(); to undo the changes