2012/05/13

use Perl; Becoming a Friar on PerlMonks

Over the years, I've joined close to a few hundred different technical sites and mailing lists where peers help each other with the problems they're stuck on. These lists/sites cover network engineering, open source operating systems, infrastructure hardware configuration and programming.

Each of these sites (I'll use the term 'sites' to refer to both mail lists and websites from here on out) has its own culture, etiquette and rules. Some even go as far as to have a formal Charter that describes what is permitted and how the site works. Others have an informal Charter that is created and modified on-the-fly by the members who have been there the longest and who have the most respect.

I've always been the type of person who, on joining a new site, seeks out a Charter and reads the FAQ to learn how the site works, what is expected of me and what I should expect from others. I then 'lurk' the site for at least a few days to get an idea of how others interact with each other before I consider posting.

PerlMonks (PM) is a site I joined in mid 2009, and it is one of my favourite places to live online. I always have a Firefox tab open to PM on all the computers I work on. Although I had been coding Perl since ~2k3 as a side-effect of my then network engineering/systems administration job, I found everything I needed through search. It got to a point where I felt I could begin to help others because my knowledge had started gaining steam. PerlMonks is a site dedicated both to helping newcomers to the Perl programming language and to providing aid to those of us who are a fair amount more experienced.

The site is home to many of the most experienced, fluent and elite Perl programmers in the world. Learn the ropes and you'll get along great. You may even get a dose of criticism that shatters your ego, which of course is good from time-to-time :)

Respect gained is respect earned, and I feel I've done a reasonable job on PM. PM has a 'leveling' system, by which you get 'points' (votes) for each quality post you provide. At certain point levels, you get privileges to perform certain site administration functions due to your dedication to the site. As I approached level 9 (Friar), I knew I'd gain access to the Consideration system. So, at about 200 points before I reached that level, I applied the same diligence I do when I first sign up to a site, and I researched what was expected of me with this new authority. Here's some of the information I read during that journey.

First I read about the Voting and Experience system itself. How it works, how to vote, the amount of experience needed to level up etc. Although I had a good understanding of this already, I wanted to ensure I knew the details correctly.

I then moved on to a brief write-up about The Role of XP in PerlMonks, and on to the What is moderation? and straight to How do I moderate?. So, I'd actually have full right to either Approve a new post, or put it right on the PerlMonks front page where non-members entering the site would see it.

By this time I had enough knowledge to learn more specifics. I then went and looked up exactly what a Friar is, isn't and most importantly what they can actually do. I knew about moderation already, but there's something new... Consideration. This is where level 9 and up can 'consider' posts to be either edited or deleted. We can't do this directly... we 'consider' a post for the above changes, write a 'this is what and why' blurb, and then fellow monks vote whether the consideration is valid or idiotic. I saw immediately a lot of responsibility with this power. Naturally, I knew it would take me time to follow along to see what others were considering and why before I dared try it, and still felt there was more to learn.

Consideration and moderation happen through the Approval Nodelet, which is only available to level 9 (Friar) and above. Through that page I found exactly the information I was after in the first place, which was How do I use the power of consideration wisely?. Due to my experience on other sites and the standard etiquette I follow as a general rule, I had assumed most of that information anyway, but it was nice to see it documented.

In closing, I've since become a Friar and am a fair bit past that now. I believe I've approved a few posts, voted on a few considerations, but have yet to consider. I'm still hanging back following the flow from the more experienced members. So, if you've been on PerlMonks and enjoy it, inevitably you'll become a Friar, and you too will find yourself with these responsibilities and powers. The links above provide a minimal amount of reading you should do before you get to that point, and as always, when you perform an action, think about how you would feel if someone did it to you if the roles were reversed.

Cheers,

-stevieb

Update: An hour after I wrote this post, I considered my first node :)

2012/04/13

Using a DVCS for your code and documents: Part 1

This document is the first in a series that outlines the very basics of using a Distributed Version Control System (DVCS) to manage and store changes and updates to documents you write. Although a DVCS is used primarily for software code, I've come to use one for my blog posts, poetry, documentation etc. If you've never heard of revision control systems before, you might want to do a quick search online for what they do, and what they are for.

Back in the day, I used software such as CVS and SVN to house my code changes. This was tedious though. I had to set up a server, keep it running 24x7, ensure it was properly backed-up and maintained. I worked at an ISP, so these things weren't a problem for me. However, for others, having a dedicated server doesn't make a lot of sense.

Recently, I decided to give Bitbucket a try. It, like GitHub, provides free hosting of your document repositories. The most interesting and useful feature of a DVCS is that it is indeed distributed. If my CVS or SVN server went down, that was it... work stopped. With a DVCS, I can clone my repository, and if the remote server goes down, I can use my current clone just as if it was the original repository. I can clone it again, or even use it to store new changes.

Go on over to Bitbucket and set yourself up an account. Once you've done that, navigate to where you can create a new repository. I'm going to name mine "Test" for this example. I am going to make it public, so that you can see my repository at the end of this post. I don't need a bug tracker, so I'm leaving that option unchecked, as well as the Wiki option. Although I've written patches against Git repositories, I like Mercurial, so I'm going to use that. (The command "hg" is for Mercurial, so you'll see it often in my examples.)

Once you've created the repository (hereinafter: repo), click the link that states "I'm starting from scratch".

Create a repo directory on your computer, and change into it:

$ cd ~
$ mkdir repos
$ cd repos

Now copy the 'clone' line that you see on the Bitbucket page, and paste it on your command line:

steve@ub:~/repos$ hg clone https://bitbucket.org/spek/test

Output:

destination directory: test
no changes found
updating to branch default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

We've cloned our new, empty repository. It created a new sub-directory called "test". Change into this new directory:

steve@ub:~/repos$ cd test

steve@ub:~/repos/test$ ls -la
total 12
drwxrwxr-x 3 steve steve 4096 2012-04-13 17:59 .
drwxrwxr-x 3 steve steve 4096 2012-04-13 17:59 ..
drwxrwxr-x 3 steve steve 4096 2012-04-13 17:59 .hg

The '.hg' directory is where the important information is stored.

Let's get right into using your repository. I will go through the basic commands as we encounter them.

Start by creating a new file, and adding some text to it. I use vim, but you can of course use any editor of your choosing.

steve@ub:~/repos/test$ vim test.pl

Save the file. Here's what my new file looks like:

#!/usr/bin/perl

use warnings;
use strict;

print "Hello, world!\n";

Ok, we have a new file with some text in it. Let's check the status of the file in relation to our repository. The command 'hg' is for Mercurial:

steve@ub:~/repos/test$ hg status
? test.pl

The "?" before the filename means that this file is unknown to the repository. There will be many cases where you won't want to add certain files to a repository, but we'll deal with that in a later post. For now, we want to add this file:

steve@ub:~/repos/test$ hg add test.pl 

Note that you can also call "hg add" with no filenames. This will include ALL files (recursively). Now let's re-check the status of our repository:

steve@ub:~/repos/test$ hg status
A test.pl

The "A" before the filename means that you have added a new file. It has not been committed to the repository yet. Let's do this now:

steve@ub:~/repos/test$ hg commit -m "-initial import"

Output:

abort: no username supplied (see "hg help config")

Whoops! What happened? Well, Mercurial (hg) needs to know who you are before it will record a commit, because every changeset is stamped with a username. We'll discuss how to set this momentarily. First, let's focus on the "commit" command to hg. The "-m" flag tells hg that you want to add an inline message for this change. If you omit the -m and the following message, you will be dropped into your default editor to write one out there. You can cancel a commit simply by exiting your editor without saving. Now, back to adding your user information. While in your repository, create a new file named "hgrc", and add your information. Mine looks like this:

steve@ub:~/repos/test$ cat hgrc 
[paths]
default = https://spek@bitbucket.org/spek/test
[ui]
username = steveb <steveb@cpan.org>

The "default" directive under the [paths] section is the link to your repository on Bitbucket. Under the [ui] section, the "username" is the name and email address that hg records as the author of your commits (I use the address I signed up to Bitbucket with). I don't add anything further... I prefer to just type my password out manually when I need to. Once your 'hgrc' file is created, move it into the .hg directory:

steve@ub:~/repos/test$ mv hgrc .hg/

Now rerun your commit:

steve@ub:~/repos/test$ hg commit -m "-initial import"

Went off without a hitch. Committing saves your changes as a changeset in your local repository. To push them to the master (in this case, Bitbucket), we use "push". Let's upload the local commits now:

steve@ub:~/repos/test$ hg push

Output:

pushing to https://spek@bitbucket.org/spek/test
searching for changes
http authorization required
realm: Bitbucket.org HTTP
user: spek
password: 
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: bb/acl: spek is allowed. accepted payload.

Done. We added a file with "hg add", committed the changes via "hg commit", and uploaded the single changeset with "hg push". There's a problem though. My program was supposed to say hello to the universe, not just the world! Edit the test.pl file to print "Hello, universe!\n"; instead of "Hello, world!\n";, and then save the changes.

Now commit this update ("hg commit"), this time without the '-m' flag so it opens your editor. Add the following line in the commit message, and then save:

- replaced world with universe in print statement

Oh, man! I wanted to insert a comment saying what the print line is doing, but I forgot. Edit test.pl so it looks like this:

#!/usr/bin/perl

use warnings;
use strict;

# say "hi" to the universe
print "Hello, universe!\n";

Let's check the status again:

steve@ub:~/repos/test$ hg status
M test.pl

The 'M' prior to the filename signifies that we have a Modified file that hasn't been committed yet. Do that now:

steve@ub:~/repos/test$ hg commit -m "- added comment for print universe"

We committed two changes (which created two changesets), but these changes are local only. Let's push them up to our master repository:

steve@ub:~/repos/test$ hg push
pushing to https://spek@bitbucket.org/spek/test
searching for changes
http authorization required
realm: Bitbucket.org HTTP
user: spek
password: 
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 2 changesets with 2 changes to 1 files
remote: bb/acl: spek is allowed. accepted payload.

Notice this time the output found two changesets. This is because we committed two changes prior to pushing the first one. A rule of thumb is "commit early, commit often". I follow the same rule with push.

So, we have our program created, and it runs great. We have made changes, and saved these changes. Let's see the basics on viewing the changes we've made. "hg log" shows you a list with a brief set of details for all the changesets you've committed. They appear in reverse chronological order. The hexadecimal string next to the "N:" in the "changeset:" line represents the specific changeset. This is much more complicated than how I'm describing it, so we'll focus on these details in a later post.

steve@ub:~/repos/test$ hg log

changeset:   2:52eec25c7d12
tag:         tip
user:        steveb@cpan.org
date:        Fri Apr 13 19:30:58 2012 -0400
summary:     - added comment for print universe

changeset:   1:b558f5695e13
user:        steveb@cpan.org
date:        Fri Apr 13 19:29:09 2012 -0400
summary:     - replaced world with universe in print statement

changeset:   0:739f47eadd48
user:        steveb@cpan.org
date:        Fri Apr 13 19:14:08 2012 -0400
summary:     -initial import

The log is great for history, but what if we need to see more information... such as the list of files changed, and all the lines in the commit message as opposed to just the first? Adding the "-v" flag to 'hg log' will show you the files changed, as well as all the lines you added to your commit message. Here's an example from one of my real repositories:

steve@ub:~/devel/repos/devel-trace-method$ hg log -v | more

changeset:   13:2ca86cf74c83
user:        steveb 
date:        Sat Mar 03 10:54:36 2012 -0500
files:       Changes Makefile.PL README lib/Devel/Trace/Method.pm
description:
- 0.08 POD cleanup, Makefile.PL fix
- added meta section to Makefile.PL allowing us to tell CPAN
  that we use a different tracker than rt.cpan.org
- cleaned up POD so that LICENSE would appear correctly on
  CPAN

Within each commit, we can now see what files we changed, and the list of comments we made per changeset. What if we need to see the actual changes themselves? No problem... add the "-p" (patch) flag to 'hg log':

steve@ub:~/repos/test$ hg log -p

...wait! That lists ALL of our changesets (commits). That's too much information for what we want. I want to know about the last commit only right now. In Mercurial, "tip" is not a branch but a name that always points at the most recent changeset in the repository. Other revision control systems have a roughly similar idea that they call HEAD. Let's check out the actual changes like we tried above, but only the most recent change. Again, the '-p' flag means "patch". The '-r' flag means "revision". We want to see the actual physical changes (-p) to the most recent revision (-r):

steve@ub:~/repos/test$ hg log -p -r tip

changeset:   2:52eec25c7d12
tag:         tip
user:        steveb@cpan.org
date:        Fri Apr 13 19:30:58 2012 -0400
summary:     - added comment for print universe

diff -r b558f5695e13 -r 52eec25c7d12 test.pl
--- a/test.pl Fri Apr 13 19:29:09 2012 -0400
+++ b/test.pl Fri Apr 13 19:30:58 2012 -0400
@@ -3,4 +3,5 @@
 use warnings;
 use strict;
 
+# say "hi" to the universe
 print "Hello, universe!\n";

What if I want to see the more verbose changeset information (all files and all comments)? Can I? Youbetcha!:

steve@ub:~/repos/test$ hg log -p -v

Here are a couple "howtos" regarding the "hg log" command. You can add the verbose (-v) and patch (-p) flags to either of these:

# view the first commit
steve@ub:~/repos/test$ hg log -r 0

# review the 1st and 3rd commit
steve@ub:~/repos/test$ hg log -r 0 -r 2

# review the most recent commit
steve@ub:~/repos/test$ hg log -r tip

This tutorial series is primarily designed to describe the command-line usage of a DVCS application. The web-based display of the storage facility is outside the scope of this document, but it can be very handy. Here is what my online repo looks like after completing the examples in this post.

I'll end this post here. You've learnt the very basics on how to clone a Mercurial repository from your free Bitbucket account, how to commit changes into changesets, how to push the changesets back into the master repository, and how to do some basic review of the changes that you've made. In the next episode, we'll delve into how you can perform more advanced reviews of your changes, revert your working directory to a previous change and create branches to manage different change tracks, along with an explanation and examples of how DVCS differs from non-distributed versioning systems. We'll also touch on the ".hgignore" file, which allows you to use "hg add" without adding files you don't want included.

Thanks for reading. If you've read any of my other posts, you know I appreciate all feedback, good and/or bad in either the comments below, or privately via email.

2012/04/09

use Perl; Poetry: Method to my $madness

Back in the day,
I was unable to say(),
I had to print() everything to find the err of my ways,
but things changed,
and now I can return(),
to the main() program of life,
where nearly everything burns,
I'm not rapping this on a mic,
I'm just writing to be free,
before I have to split(),
and perform other tasks that bore me.

So I grab my map(),
and put two and two together,
hashing into keys the locations,
that I can remember,
but I've %seen these places,
now what do I do?,
I pick a rand()om place that none of us have ever been to,
but the rules are strict,
I keep getting warnings,
something about being literal when I'm trying to be corny,
but I'm out of the ord()inary,
I'm a chr()acter in reverse(),
I know what Larry wanted,
but my linguistics are the worst,
this is freeverse,
I DESTROY()ed the English language,
I just hope I get eval()uated to true,
before I die() and then get laughed at,
but it isn't like that,
all I seek() is some closure,
I'm not an array of elements trying to get sort()ed into order,
I'm random,
I try to please for() each(),
it's like a disease,
I've thrown away all the keys(),
I write in these confinements,
it's a very strict type of squeeze.

The warnings teach me lessons,
correct my spelling and my messes,
but if Larry's crew would finish Perl6,
to my $madness I'd have methods.

- stevieb 20120409

2012/04/08

use Perl; Guide to references: Part 5

This is the final part in my five part guide to Perl references. It's a complete program that contains a menu system along with the card game 'war'. This is pretty serious spaghetti code, so I will likely replace it as soon as I come up with something that uses all of the examples in this series but has a more logical flow and is easier on the eyes :)

  • Part 1 - The basics
  • Part 2 - References as subroutine parameters
  • Part 3 - Nested data structures
  • Part 4 - Code references
  • Part 5 - Concepts put to use (this document)

Please leave any corrections, criticisms, improvements, additions, questions and requests for further clarity in the comments section below, or in an email.

The following program code can be copy/pasted without all of the comments from my scripts repository.

#!/usr/bin/perl

use warnings;
use strict;
use 5.10.0;

# create a master dispatch table, using a reference to an
# external sub, and two inline subs

my %dispatch_table = (
                        play    => \&play_game,
                        hello   => sub { say "\nHello, world!\n"; },
                        'exit'  => sub { say "\nGoodbye!\n"; exit; },
                    );

# create an href to the dispatch table hash

my $dt_ref = \%dispatch_table;

# take a reference to the closure within the games_played() sub

my $games_played = games_played();

# loop over the menu until the user exits

while ( 1 ){

    system( "clear" );

    # get the dispatch table options by dereferencing the
    # dispatch table href

    my @options = keys %{ $dt_ref };

    say "Enter one of these options: " . join( ' ', @options );
    chomp ( my $command = <STDIN> );

    # exit if an illegal option was entered by the user

    exit if ! exists $dt_ref->{ $command };
    
    # otherwise, execute the sub the user selected

    $dt_ref->{ $command }->();

    # check to see if any games have been played through
    # the $games_played closure cref

    if ( $games_played->() ){
        say "You've played " . $games_played->() . " games.\n";
    }

    print "Please press ENTER...";
    <STDIN>;
}
sub play_game {

    # this is the main game sub, called through the dispatch
    # table

    system( "clear" );

    # create a deck of cards using a hash, and assign
    # a numeric value to the face value key

    my %deck;
    my $card_value = 14;

    for ( qw( A K Q J ), ( reverse 2..10 ) ){

        $deck{ $_ } = $card_value;
        $card_value--;
    }

    # a list of the card faces (without their numeric values)
 
    my @cards = keys %deck;

    print "Enter your name: ";
    chomp ( my $player = <STDIN> );

    print "Enter number of rounds (default: 5): ";
    chomp ( my $rounds = <STDIN> );
    # fall back to the default unless the input is a whole number
    $rounds = 5 if $rounds !~ /^\d+$/;

    # create a nested HoH for the players, using an href as
    # the top level

    my $players = { 
                    $player => {
                                score    => 0,
                                card     => undef,
                            },
                    npc      => {
                                score    => 0, 
                                card     => undef,
                            },
                    };

    my @player_names = keys %{ $players };

    for my $round ( 1 .. $rounds ){

        print "Round $round: ";

        for my $player ( @player_names ){

            # call deal(), passing in an aref of the cards array

            my $card = deal( \@cards );
            print "$player $card   ";

            # set the players current card in their card slot in the
            # players HoH

            $players->{ $player }{ card } = $card;
        }

        # call the compare_hands() sub by passing in an anonymous
        # hash (reference) inline in the call, with three parameters.
        # All three values are references

        compare_hands({ 
                        player_names => \@player_names,
                        players      => $players,
                        deck         => \%deck,
                     });

        print "\n";
    }

    print "\n";

    # loop over players, and get each of their final
    # scores out of the players HOH

    for my $player ( @player_names ){

        my $score = $players->{ $player }{ score };
        say "$player won $score rounds.";
    }

    print "\n";

    # update games played

    $games_played->( 1 );
}
sub deal {

    # take an aref of @cards, and return a random one

    my $deck_of_cards = shift; # aref
    return $deck_of_cards->[ rand @{ $deck_of_cards } ];
}
sub compare_hands {

    my $named_params = shift;
    
    # separate out the data from the named parameters
    # in the href we got passed in

    my $player_names    = $named_params->{ player_names };
    my $players         = $named_params->{ players };
    
    # we convert the last named param back into a hash
    # by dereferencing it

    my %deck            = %{ $named_params->{ deck } };

    my ( $player1, $player2 ) = @{ $player_names };

    # get each player's card

    my $p1_card = $players->{ $player1 }{ card };
    my $p2_card = $players->{ $player2 }{ card };

    # check the face of the card to the %deck hash to
    # retrieve the numerical value

    my $p1_card_val = $deck{ $p1_card };
    my $p2_card_val = $deck{ $p2_card };

    # nobody wins this round... it's a tie

    return if $p1_card_val == $p2_card_val;

    if ( $p1_card_val > $p2_card_val ){
        # player 1 wins
        $players->{ $player1 }{ score }++;
    }
    else {
        # player 2 wins
        $players->{ $player2 }{ score }++;
    }
}    
sub games_played {
    
    # state data

    my $games_played = 0;

    # our games_played closure

    return sub {
                my $add = shift;
                $games_played += $add if $add;
                return $games_played;
               }
}
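If you'd like to try the named parameter call that compare_hands() relies on in isolation, here's a minimal sketch. The describe_round() sub and the data I feed it are made up for illustration and aren't part of the game above; the idea is just that the caller passes one anonymous hash, and the sub pulls the nested references back out by name.

```perl
#!/usr/bin/perl

use warnings;
use strict;
use 5.010;

# describe_round() is a hypothetical sub used only for this sketch

sub describe_round {

    my $named_params = shift; # href

    # dereference the aref back into an array, and keep
    # the scores as an href

    my @names  = @{ $named_params->{ player_names } };
    my $scores = $named_params->{ scores };

    return join ', ', map { $_ . ': ' . $scores->{ $_ } } @names;
}

# pass an anonymous hash inline, just like the call to compare_hands()

say describe_round({
                    player_names => [ qw( steve npc ) ],
                    scores       => { steve => 3, npc => 2 },
                  });

# prints: steve: 3, npc: 2
```

The nice part of this style is that the order of the parameters no longer matters, and adding a new one later doesn't break existing callers.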

Thank you very much for reading. I have received a lot of great feedback on the series, both from people informing me they have learnt a great deal, and others with corrections and additions. I appreciate you all. I hope you have enjoyed my Guide to reference tutorials. Please feel free to provide me feedback so I may improve on my style for future posts.

Regards and thanks,

-stevieb

2012/04/07

use Perl; Guide to references: Part 4

This is part four in my five part series on Perl references. In this post, we will be discussing code references (coderef, or just cref), some of the benefits they provide, and some interesting use cases, including closures and dispatch tables. If you haven't already, you may want to review the other parts in the series:

  • Part 1 - The basics
  • Part 2 - References as subroutine parameters
  • Part 3 - Nested data structures
  • Part 4 - Code references (this document)
  • Part 5 - Concepts put to use

As with all of the other parts in the series, I request that you leave corrections, criticisms, improvements, additions, questions and requests for further clarity in the comments section below, or in an email.

CODE REFERENCES

A code reference in Perl is no different than any of the other references we've discussed in the previous episodes, but instead of pointing to a data variable, the ref points to a subroutine. You take a reference to a subroutine the same way you take a reference to anything else:

sub hello {
    say "Hello, world!";
}

my $cref = \&hello;

The & sigil represents a sub, and it is needed when we take the reference. Using the ref is the same as before as well: we must use the -> deref operator to access the item the reference points to.

# use an aref
$aref->[ 0 ];

# use an href
$href->{ a };

# use a cref
$cref->();
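Here's a small runnable sketch that puts the three side by side. The variable names, the data and the shout() sub are only placeholders I picked for the example. The core ref() function reports what kind of thing a reference points to, which is handy when you're not sure what you've been handed.

```perl
#!/usr/bin/perl

use warnings;
use strict;
use 5.010;

# placeholder data and sub, for illustration only

my @letters = qw( a b c );
my %stock   = ( apples => 3, pears => 7 );

sub shout {
    my $text = shift;
    return uc $text;
}

# take a reference to each with the backslash

my $aref = \@letters;
my $href = \%stock;
my $cref = \&shout;

# ref() tells us what each reference points to

say "aref is a: " . ref( $aref );    # aref is a: ARRAY
say "href is a: " . ref( $href );    # href is a: HASH
say "cref is a: " . ref( $cref );    # cref is a: CODE

# the -> operator dereferences each one

say "first letter: " . $aref->[ 0 ];          # first letter: a
say "apples: "       . $href->{ apples };     # apples: 3
say "shouted: "      . $cref->( "hello" );    # shouted: HELLO
```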

We can also assign an anonymous sub to a cref in cases where we don't necessarily have to define the function with a name:

my $cref = sub { say "Hello, world!"; };

Now that we have that out of the way, lets move on to some practical and interesting uses for code references.

CLOSURES

The most common type of closure is a sub that returns a reference to an inner sub. They are often used in Object Oriented Programming (OOP) (which is outside the scope of this tutorial) to keep state data. State data is data that persists after the program has exited the scope in which the data was defined. I can explain it better with some code:

sub persist {

    my $count = 0;
    return sub { say $count++; }
}

my $count_cref = persist();

$count_cref->();
$count_cref->();
$count_cref->();

First, we define a subroutine named persist(). Inside that sub we define a lexical variable $count. (A lexical variable is one that cannot be seen outside the scope of the block it is declared in. In this case, nothing outside of persist() can see the $count variable.) After defining $count, we create an anonymous sub that prints the value of $count, and then adds one to it. We then call persist(), assigning its return value to $count_cref. The return of persist() is a reference to the anonymous subroutine.

Because $count_cref points to the inner anonymous sub returned from persist() and not to persist() itself, the $count variable is never reset, and the sub that $count_cref points to will always keep its own version of $count, incremented each time the anon sub is executed through the reference.

To show how the $count variable retains its value as long as $count_cref is alive, here is the output from the above code snip:

0
1
2
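As a further sketch of the same idea, the outer sub can take arguments, and the inner sub holds on to those as well. The make_counter() name and its $start and $step parameters are my own invention for this example, not something from the code above.

```perl
#!/usr/bin/perl

use warnings;
use strict;
use 5.010;

# make_counter() is a hypothetical helper; the inner sub
# captures both $count and $step

sub make_counter {

    my ( $start, $step ) = @_;
    my $count = $start;

    return sub {
        my $current = $count;
        $count += $step;
        return $current;
    };
}

my $by_fives = make_counter( 10, 5 );

say $by_fives->();    # 10
say $by_fives->();    # 15
say $by_fives->();    # 20
```

Nothing outside of the returned sub can touch $count or $step, so the only way to move the counter along is to call it.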

Closures aren't only handy for OOP. We can use the same persist() sub to create multiple counters.

sub persist {

    my $count = 0;
    return sub { $count++; }
}

my $count_a_cref = persist();
my $count_b_cref = persist();
my $count_c_cref = persist();

say "Count A: " . $count_a_cref->();
say "Count A: " . $count_a_cref->();
say "Count B: " . $count_b_cref->();
say "Count B: " . $count_b_cref->();
say "Count B: " . $count_b_cref->();
say "Count C: " . $count_c_cref->();

Output:

Count A: 0
Count A: 1
Count B: 0
Count B: 1
Count B: 2
Count C: 0

Calls through one cref do not affect the state variables held by the other crefs.
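One related case is worth a quick sketch. If a single call to the outer sub returns more than one inner sub, those subs share the same state variable, while a second call to the outer sub still gets a fresh one. The make_tally() sub is a hypothetical name I've used just to show this.

```perl
#!/usr/bin/perl

use warnings;
use strict;
use 5.010;

# make_tally() is a hypothetical helper that returns two
# closures over the same $count

sub make_tally {

    my $count = 0;

    my $add  = sub { my $amount = shift; $count += $amount; return; };
    my $read = sub { return $count; };

    return ( $add, $read );
}

my ( $add_a, $read_a ) = make_tally();
my ( $add_b, $read_b ) = make_tally();

$add_a->( 2 );
$add_a->( 3 );
$add_b->( 10 );

say "Tally A: " . $read_a->();    # Tally A: 5
say "Tally B: " . $read_b->();    # Tally B: 10
```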

Here's an example that shows a more practical case where closures with state variables could be useful. If you're thinking that globals would do the trick here, you're right; that isn't the point of this tutorial though ;). I'm sticking with simple here. In my fifth and final installment, we'll write something far more realistic that brings all aspects of the series together.

sub write_line {

    my $count = 0;
    return sub { return ++$count; }
}

# call the function twice, each time receiving
# a separate anonymous sub, along with separate
# state variables

my $steve_lines = write_line();
my $sarah_lines = write_line();

# steve writes two lines of code

my $steve_total;
$steve_total = $steve_lines->();
$steve_total = $steve_lines->();

# sarah writes one

my $sarah_total;
$sarah_total = $sarah_lines->();

say "Steve wrote $steve_total lines of code";
say "Sarah wrote $sarah_total lines of code";

Output:

Steve wrote 2 lines of code
Sarah wrote 1 lines of code

As an aside: In the preceding example, I had to declare the sub prior to using it. To understand why and how to get around that, see my Purpose and practical use of Perl's named blocks post.

Closures aren't limited to being returned from outer subs though. Any block in which an anonymous sub can capture lexical data from the surrounding scope can be used to create a closure. Here's an example:

my %h;
for my $color ( qw(red green blue) ){
   $h{$color} = sub { say $color };
}

$h{ blue }->();
$h{ red }->();
$h{ green }->();

In that example, we iterate over three colours. For each colour, we set a hash key as the colour and set that key's value as an anonymous sub that, when called, prints the colour. Each sub captures the lexical $color from its own pass through the loop, which is what makes it a closure. The following three lines execute the closures. Although we could have, we didn't keep any changing state in these closures. Also note that this code auto-generated a dispatch table, which we are going to learn about next.
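To make the capture in that kind of loop a little more visible, here's a sketch where each generated sub uses its captured value in a calculation. The build_multipliers() sub and its names are hypothetical, made up for this example only.

```perl
#!/usr/bin/perl

use warnings;
use strict;
use 5.010;

# build_multipliers() is a hypothetical helper, not part of the series

sub build_multipliers {

    my @factors = @_;
    my %multiply_by;

    for my $factor ( @factors ){

        # each pass through the loop has its own $factor, and
        # the anonymous sub holds on to it

        $multiply_by{ $factor } = sub {
            my $number = shift;
            return $number * $factor;
        };
    }

    return \%multiply_by;
}

my $multiply_by = build_multipliers( 2, 3, 10 );

say "7 x 2  = " . $multiply_by->{ 2 }->( 7 );     # 7 x 2  = 14
say "7 x 10 = " . $multiply_by->{ 10 }->( 7 );    # 7 x 10 = 70
```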

DISPATCH TABLES

Dispatch tables are hashes whose values are references to subroutines. A dispatch table is like a table of contents that allows you to execute code through the hash keys.

my %dt = (
            hello => sub { say "Hello, world!"; },
            add   => \&add,
        );

# call the functions

my $more = $dt{ add }->( 5, 5 );
$dt{ hello }->();

sub add {

    my $x = shift;
    my $y = shift;
    return ( $x + $y );
}

First we define the dispatch table hash. The first key has a value of an anonymous sub. The second key contains a cref that points to the add() sub. This shows how short, one-line type subs can be housed within the dispatch table. The add sub has been defined to take two parameters. When we call the add sub, you can see how we call the sub through the hash key, which executes the sub through the cref it contains as its value. We then insert the parameters as normal.

An example of the benefits of a dispatch table is a menu system, where a user must select from a range of options. You give the user a list of options to select from, and in your dispatch table, you name your keys as the options you provided the user with. Each option to the user is dropped directly into the key field of the hash, and the subsequent subroutine runs.

In the following example, we have three operations the user can perform where the hash value is a cref to an external sub. The fourth option, exit, is short and simple, so we create it as an anonymous sub within the table itself. After a couple of sanity checks to ensure the input is legal, running the correct operation is as simple as putting the user's input into the dispatch table.

Here is a fully working menu program based on the concept of a dispatch table. I've kept it as simple and as basic as possible for clarity.

#!/usr/bin/perl

use warnings;
use strict;
use 5.10.0;

my %dt = (
            add      => \&add,
            subtract => \&subtract,
            multiply => \&multiply,
            'exit'   => sub { say "\nGoodbye!\n"; exit; },
        );
    
while (1) {

    system( "clear" );
    
    print "Please enter either add, subtract, multiply or exit: ";
    chomp ( my $operation = <STDIN> );

    # exit if told to

    $dt{ $operation }->() if $operation eq 'exit';

    # exit if illegal param

    if ( ! exists $dt{ $operation } ){
        say "\nIllegal input... exiting\n";
        exit;
    }
    
    print "Type in your first number: ";
    chomp ( my $x = <STDIN> );

    print "Type in your second number: ";
    chomp ( my $y = <STDIN> );

    # run the command selected by the user

    my $result = $dt{ $operation }->( $x, $y );

    say "\nPerforming $operation on $x and $y = $result\n";

    print "\nPress ENTER to continue...\n";
    <STDIN>;
    
}
sub add {
    my ( $x, $y ) = @_;
    return $x + $y;
}
sub subtract {
    my ( $x, $y ) = @_;
    return $x - $y;
}
sub multiply {
    my ( $x, $y ) = @_;
    return $x * $y;
}

Hopefully that simplistic example was enough to at least give you an idea of what dispatch tables could be capable of.

That's it for this episode, thanks for reading. In my next and last post in the series, we'll bring everything together in a single program that utilizes most of the concepts of what we have learnt throughout.

Update: Thanks to maximum-solo for pointing out that I had limited my definition of closures to only a single use case, and for the example code that returns closures from a for loop.

Update: Thanks to Jay Scott for sending typographical and grammatical corrections, and for numerous logical code description fixes.

use Perl; Guide to references Part 3

This is Part 3 of my five part guide to references series. In Part 1 we learnt the basic syntax for using references, in Part 2 we saw how to use references in subroutine calls, and in this episode we'll focus solely on nested data structures.

  • Part 1 - The basics
  • Part 2 - References as subroutine parameters
  • Part 3 - Nested data structures (this document)
  • Part 4 - Code references
  • Part 5 - Concepts put to use

At this point, it is rather imperative that you have a firm grasp on both the concepts and the syntax for creating, dereferencing and otherwise using references. If you are unfamiliar with any of these, I recommend you see Part 1.

As with the other parts in the series, I request that you leave corrections, criticisms, improvements, additions, questions and requests for further clarity in the comments section below, or in an email.

NESTED DATA STRUCTURES

The two most elementary complex data structures are an array of arrays (AoA) and a hash of hashes (HoH). An AoA is simply an array where each element contains a reference to another array. Here's an example based on some of the concepts we've already learnt:

my @a;
my @a_0 = ( 1, 2, 3 );
my @a_1 = ( 4, 5, 6 );
my @a_2 = ( 7, 8, 9 );

$a[0] = \@a_0;
$a[1] = \@a_1;
$a[2] = \@a_2;

Using Data::Dumper, we see the contents of @a as follows. (I've inserted the comments for clarity)

$VAR1 = [ # the top @a array
          [ # $a[0]
            1,
            2,
            3
          ],
          [ # $a[1]
            4,
            5,
            6
          ],
          [ # $a[2]
            7,
            8,
            9
          ]
        ];

AoAs are good for storing multiple lists of data where the items will always retain their order. To access individual elements of the nested arrays, we need the -> deref operator again:

my $x = $a[0]->[0]; # value is 1

Note the positioning. We access the first element of @a as normal, but since $a[0] is a reference to another array, we must dereference here. Again:

my $y = $a[2]->[2]; # value is 9

Still using the above AoA structure, here's how to loop over each aref within the array. Note in the nested for() loop we see the @{} dereference operators again to access the data that each aref points to:

my $x = 0;

for my $aref ( @a ){

    say "in top level of a, elem $x";
    $x++;

    my $y = 0;

    for my $aref_elem ( @{ $aref } ){

        say "in second level elem $y, elem is: $aref_elem";
        $y++;
    }
}

Output:

in top level of a, elem 0
in second level elem 0, elem is: 1
in second level elem 1, elem is: 2
in second level elem 2, elem is: 3
in top level of a, elem 1
in second level elem 0, elem is: 4
in second level elem 1, elem is: 5
in second level elem 2, elem is: 6
in top level of a, elem 2
in second level elem 0, elem is: 7
in second level elem 1, elem is: 8
in second level elem 2, elem is: 9

You can compare that output to the loop itself, and also to the Data::Dumper output above to get a better idea of the nested structure.
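
If you would rather walk the structure by index than keep your own counters, a minimal sketch could look like the following. It rebuilds the same @a as above; the variables $i, $j and $value are my own names for illustration.

```perl
use strict;
use warnings;
use feature 'say';

my @a_0 = ( 1, 2, 3 );
my @a_1 = ( 4, 5, 6 );
my @a_2 = ( 7, 8, 9 );

my @a = ( \@a_0, \@a_1, \@a_2 );

# $#a is the last index of the top level array
for my $i ( 0 .. $#a ){

    # $#{ $a[$i] } is the last index of the array that $a[$i] points to
    for my $j ( 0 .. $#{ $a[$i] } ){

        my $value = $a[$i]->[$j];
        say "row $i, col $j: $value";
    }
}
```

The index form is handy when you need to know where in the structure you are, for instance to change an element in place.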

More interesting and (imho) far more useful than the AoA is the HoH. Here's where significant usefulness begins.

my %person; # top level hash container

my %clothes  = ( shirt => 'red', pants => 'black', );
my %schedule = ( work => '0800', home => '0500', sleep => '2300', );
my %skills   = ( programming => 'poor', social => 'good' );

$person{ clothes  } = \%clothes;
$person{ schedule } = \%schedule;
$person{ skills }   = \%skills;

The Dumper output for a HoH looks much more interesting and easy to follow than the AoA:

$VAR1 = { # %person

          'skills' => {
                        'programming' => 'poor',
                        'social' => 'good'
                      },
          'clothes' => {
                         'pants' => 'black',
                         'shirt' => 'red'
                       },
          'schedule' => {
                          'work' => '0800',
                          'home' => '0500',
                          'sleep' => '2300'
                        }
        };

Here are a few examples of how to use the data:

# get the person's shirt

my $shirt_colour = $person{ clothes }->{ shirt }; # red

# change the person's shirt

$person{ clothes }->{ shirt } = 'black';

# list the person's skills

say "Person has the following skills: ";

for my $skill ( keys %{ $person{ skills } } ){
    print "$skill ";
}
print "\n";

# list each skill with the ability to perform the skill

say "Person's ";

while ( my ( $skill, $ability ) = each %{ $person{ skills } } ){

    print "$skill is $ability\n";
}
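
To walk the whole HoH rather than a single inner hash, you can nest two loops over the keys. This is only a sketch using a cut-down version of the %person structure; @lines and $value are names I've made up to collect the output, and I sort the keys so the order is predictable.

```perl
use strict;
use warnings;
use feature 'say';

my %clothes = ( shirt => 'red', pants => 'black' );
my %skills  = ( programming => 'poor', social => 'good' );

my %person = ( clothes => \%clothes, skills => \%skills );

my @lines;

# outer loop: the keys of the top level hash
for my $category ( sort keys %person ){

    # inner loop: dereference the href to get the inner hash's keys
    for my $item ( sort keys %{ $person{ $category } } ){

        my $value = $person{ $category }->{ $item };
        push @lines, "$category: $item is $value";
    }
}

say for @lines;
```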

When dealing with a simple HoH, the deref operator (->) is not required between the two sets of braces. Because Perl knows that a hash can never directly contain another hash, it is not ambiguous to type $person{ clothes }{ shirt }; Perl can identify that the value stored under the first key is a reference to another hash. Where the -> is required is when the top level of the structure is a reference itself:

# create hrefs to anonymous hash

my $inner_1 = { a => 1, b => 2 };
my $inner_2 = { z => 26, y => 25 };

# add hrefs to hash

my %h = ( ref_1 => $inner_1, ref_2 => $inner_2 );

# take a ref to the %h hash

my $href = \%h;

# because $href is now a reference itself, we MUST use the dereference operator

say $href->{ ref_2 }{ z }; # prints 26

What if you wanted to keep track of all the classes in a school, and for each class, keep a list of all the student names? A HoH isn't needed, because all we want are the student names. The student names don't need a value. In this case, we would use a hash of arrays, or HoA:

# define the classrooms

my @room_1 = qw( steve mike dawn megan );
my @room_2 = qw( chris alexa melissa dave );
my @room_3 = qw( brittany hakim francois );

# declare the school. we'll declare it as a scalar
# because we're going to use an anonymous hash

my $school; # will become an href

# add the classrooms to the school

$school->{ room1 } = \@room_1;
$school->{ room2 } = \@room_2;
$school->{ room3 } = \@room_3;

# who's in room 2?

for my $student ( @{ $school->{ room2 } } ){
    say $student;
}

# output:
chris
alexa
melissa
dave

Notice the use of the array deref operator @{} in the for line. Things are starting to look a little more complex. Because $school->{ room2 } contains a reference to an array, we must dereference the entire thing. That example of dereferencing an array within a hash is where I see the most difficulty for programmers who are just starting to grasp refs. It is the misunderstanding of what is actually happening here that leads programmers to make mistakes that generate output or errors such as the following:

Not dereferencing the array ref prior to printing it:

ARRAY(0x8fba97c) 

Not using -> to dereference the $school reference to access the anonymous hash it points to. When an error like the following appears, it is a loud warning that you forgot to dereference the scalar $school, and that there is no %school counterpart... indeed, $school points to an unnamed (anonymous) hash:

Global symbol "%school" requires explicit package name at ./hoa.pl line 29.
Execution of ./hoa.pl aborted due to compilation errors.

Forgetting to dereference the array ref prior to pushing a new value onto it:

Type of arg 1 to push must be array (not hash element) at ./hoa.pl line 31, near "'jeremy';"

Let's go back to school. Room three just got a new student. Let's add him to the roster.

# with push

push @{ $school->{ room3 } }, 'jeremy'; 

# or directly to the element, if we already know its position

$school->{ room3 }[3] = 'jeremy';

Let's print out all the classes.

# get the keys by dereferencing $school

for my $room_name ( keys %{ $school } ){
    
    say "Students in $room_name: ";
    print "    ";

    # get each student name from each class by
    # dereferencing each class aref

    for my $student ( @{ $school->{ $room_name } } ){
        print "$student ";
    }
    print "\n";
}

Output:

Students in room3: 
    brittany hakim francois jeremy 
Students in room1: 
    steve mike dawn megan 
Students in room2: 
    chris alexa melissa dave 

Notice that the names from the room arrays are still in their original order, but the classrooms are not. Arrays keep their elements in the order in which you assign them, whereas hashes make no promise about the order in which their keys come back, so it can look random. To ensure the rooms are listed in order in this case, we simply add sort() to the for() line:

for my $room_name ( sort keys %{ $school } ){

A side note on dereferencing nested structures. The following are equivalent:

my $x = $href->{ aref }->[0];
my $x = $href->{ aref }[0];

In other words, you only need to use the -> deref operator for the first reference encountered. Perl implicitly dereferences everything thereafter without the explicit ->. This is because everything underneath the first data structure is always a reference, and Perl knows this.
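
Here is a small sketch that puts the two forms side by side. The structure and the key names (aref, inner, z) are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use feature 'say';

# hypothetical structure: an href holding an aref and another href
my $href = {
    aref  => [ 10, 20, 30 ],
    inner => { z => 26 },
};

# the arrow between two adjacent subscripts is optional...
my $with_arrow    = $href->{ aref }->[0];
my $without_arrow = $href->{ aref }[0];

say "same value" if $with_arrow == $without_arrow;

# ...but the first arrow is not, because $href is itself a reference.
# Writing $href{ inner } would refer to a hash named %href instead.
say $href->{ inner }{ z };
```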

There is no limit to the depths and complexity you can conceive with these nested data structures thanks to references. Almost all objects in Object Oriented Programming in Perl use storage mechanisms just like this.

Thanks for reading part three of my series. In part four, we'll focus on subroutine references (coderef) and dispatch tables. Then we'll build a menu system using all of the concepts we've learnt that you can incorporate into your own programs. Once again, please leave feedback in comments, or send me an email.

use Perl; Guide to references Part 2

This is part two in my five part series on Perl references. In Part 1, we went through the basics; how to take references to items and access the items through their references. In this episode, we'll explain some of the differences and benefits of sending references into subroutines, as opposed to the list-type data variables themselves. It's divided up into three sub-sections: references as subroutine parameters, named parameters and anonymous data.

  • Part 1 - The basics
  • Part 2 - References as subroutine parameters (this document)
  • Part 3 - Nested data structures
  • Part 4 - Code references
  • Part 5 - Concepts put to use

This episode assumes that you have at least a minimal understanding of how subroutines (functions) work in Perl; both how to send data into a function, and the standard methods of accessing the data once the function has accepted it. As before, I urge you to leave corrections, criticisms, improvements, questions and requests for further clarity in the comments section below, or in an email.

From this point forward, I will often substitute certain terms with abbreviations: ref for reference, deref for dereference, aref for array reference, href for hash reference and sub or function for subroutine.

REFERENCES AS SUBROUTINE PARAMETERS

Let's start off this section with a sample piece of code:

my @a = ( 1, 2, 3 );
my %h = ( a => 10, b => 20, c => 30 );

hello( @a, %h );

sub hello {

    my @array = shift;
    my %hash  = shift;

    # do stuff
}

As it appears, you are calling the hello() function with two parameters; an array as parameter one, and a hash as parameter two. We then proceed to take the parameters and assign them accordingly. However, in Perl, this does not work as you may think. Perl doesn't keep the parameters as separate parts. Instead, it flattens all the parameters together into a single list. In the case above, if we printed the parameter list before we took anything from it, it would appear as one long list of individual items:

1 2 3 c 30 a 10 b 20 

So in the above code, @array would contain 1, while we would have forced 2 into %hash. The rest of the flattened parameters (that are essentially one long list of scalar values) remain unused.

Because refs are simple individual scalars that only point to a data structure, we can pass the ref in as opposed to the list of the data structure's contents.

my @a = ( 1, 2, 3 );
my %h = ( a => 10, b => 20, c => 30 );

my $aref = \@a;
my $href = \%h;

hello( $aref, $href );

sub hello {

    my $aref_param = shift;
    my $href_param = shift;
}

In the first example, we thought we were passing in two parameters, but perl took the values from our parameters and merged them into one long list. By passing refs, our sub receives only two parameters as intended, and we can easily differentiate our array data and our hash data. This is termed "passing by reference", and it is the most common method to pass parameters to a function when the function needs more than just a few scalar values. We can now work on the refs within the sub the same way we were doing in Part 1.
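
A quick way to see the difference for yourself is to count what the sub actually receives. This is just a sketch; count_params() is a hypothetical helper that returns the number of items in @_.

```perl
use strict;
use warnings;
use feature 'say';

my @a = ( 1, 2, 3 );
my %h = ( a => 10, b => 20, c => 30 );

# count_params() is a hypothetical helper that reports how many
# parameters arrived in @_
sub count_params {
    return scalar @_;
}

# flattened: three array elements plus three keys and three values
say count_params( @a, %h );

# by reference: one aref and one href
say count_params( \@a, \%h );
```

The first call reports nine parameters, the second reports two.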

When passing by reference, any changes made through the ref to the data it points to are permanent, and remain even after the subroutine returns. Passing data into a sub directly (not via a ref) and assigning it to lexical variables inside the sub makes a *copy* of the data, so when the sub returns, the original data is not modified. If it is necessary to keep your original data intact when you receive a ref, you can make a copy of the data by dereferencing it within the function, and returning either the copy, or a reference to the copy:

my @a = ( 1, 2, 3 );

my $aref = \@a;

my @b = hello( $aref );

say "Original array:";
for my $x ( @a ){
    print "$x ";
}

say "\nReturned copy:";
for my $y ( @b ){
    print "$y ";
}

sub hello {

    my $aref = shift;
    
    # make a copy of the referenced array
    my @array = @{ $aref };

    $array[ 0 ] = 99;

    return @array;
}

Output:

Original array:
1 2 3 
Returned copy:
99 2 3
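
For contrast, here is a minimal sketch of the opposite case, where the sub writes through the ref without making a copy. The sub name change_first() is hypothetical.

```perl
use strict;
use warnings;
use feature 'say';

my @a = ( 1, 2, 3 );

# change_first() is a hypothetical sub that modifies the caller's
# array in place
sub change_first {
    my $aref = shift;

    # no copy is made here: we write straight through the ref
    $aref->[0] = 99;

    return;
}

change_first( \@a );

# the original array has been changed
say "@a";
```

This time the original @a holds 99 2 3 after the call, because the sub never worked on a copy.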

Although we've now modified our code so that we can take data structures as a parameter via their refs, we're still using "positional" function arguments, meaning that the parameters must be sent into the function in a specified order. Here's a brief code snippet of a similar example:

sub goodbye {
    my $mandatory_param_aref = shift;
    my $optional_param_aref  = shift;
}

# call it like this

goodbye( $aref1, $aref2 );

Now, what happens if we want to modify the code to accept a second optional argument?

sub goodbye {
    my $mandatory_param_aref = shift;
    my $optional_param_aref  = shift;
    my $second_optional_aref = shift;
}

# call it like this

goodbye( $aref1, $aref2, $aref3 );

No problem. However, what happens if you don't want to use the first optional parameter? You can't just do this:

goodbye( $aref1, $aref3 );

Because the function would take $aref3 and shift it off as the first optional parameter causing potentially all kinds of grief. You could send in undef in the optional positions that you don't want to supply data for so that the second optional parameter is assigned appropriately to the correct variable within the function:

goodbye( $aref1, undef, $aref3 );

But how about in a case with five optional parameters where you only want to supply the third and fifth?

goodbye( $param1, undef, undef, $param4, undef, $param6 );

Not only is that unsightly, but it is potentially very unstable code. You can see that it wouldn't be hard to position those incorrectly. There is a solution though.

NAMED PARAMETERS USING HASH REFERENCES

my %data = (
            user => 'stevieb',
            year => 2012,            
        );

my $data_ref = \%data;

user_of_the_year( $data_ref );

sub user_of_the_year {
    my $p = shift;

    my $user = $p->{ user };
    my $year = $p->{ year };

    say "Our luser of $year is $user";
}

We created a hash with the data we want to send in to our function, then we take a reference to that hash. The hash reference is what we send into the function. Inside the function, we shift off the only parameter we received (the href), and proceed to extract the values and assign them to lexical variables through the ref using the deref operator ->.

A few things to note here. First, the positional problem is gone. The function will only ever accept a single parameter; the href. Also, if the function has optional parameters, there's no undef trickery to reposition the remaining parameters. Simply omit the named key in the hash.
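
As a sketch of how an optional named parameter might be handled, the version below treats year as optional and falls back to a default when the key is omitted. Treating year as optional, the default of 2012 and returning the string instead of printing it are all my own assumptions for the illustration.

```perl
use strict;
use warnings;
use feature 'say';

sub user_of_the_year {
    my $p = shift;

    my $user = $p->{ user };

    # year is treated as optional in this sketch; use a default
    # when the caller did not supply the key
    my $year = exists $p->{ year } ? $p->{ year } : 2012;

    return "User of $year: $user";
}

my %with_year    = ( user => 'stevieb', year => 2011 );
my %without_year = ( user => 'stevieb' );

say user_of_the_year( \%with_year );
say user_of_the_year( \%without_year );
```

The caller never has to think about position: leaving a key out is all it takes to skip an optional parameter.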

In the above function definition, it isn't mandatory to dereference the hash and extract its values to scalars right away. The last line could just as easily have been written like this:

say "Our luser of $p->{ year } is $p->{ user }";

However, I personally opt to extract immediately, so that I can very quickly see what the function expects the data to look like without having to wade through the function code. Extracting in one place also makes it very easy to visually verify that your POD function use statements are accurate.

ANONYMOUS DATA

Often it is the case that you need to make a data structure on the fly, but don't need to assign a temporary name to it. We can skip steps by using references.

Instead of this two step process:

my %h = ( a => 1, b => 2 );
my $href = \%h;

We can take a reference directly from an unnamed (anonymous) hash:

my $href = { a => 1, b => 2 };

So, to create an href to an anonymous hash, we surround the data within braces instead of parens. Note that the braces are also used to distinguish hash keys. Arrays are similar, but they use their element brackets instead:

my $aref = [ 1, 2, 3 ];

In the function example above, I created the hash, took a ref to the hash, and passed the ref into the function as a parameter. Using anonymous data, I can skip creating the hash and taking a ref to it by inserting the ref to the anonymous data right within the function call:

user_of_the_year( { user => 'stevieb', year => 2012 } );

Or for more complex function calls with named parameters, you can put it on multiple lines:

user_of_the_year({
                    user    => 'stevieb',
                    year    => 2012,
                    score   => 199,
                    awards  => 3,
                });

Thank you for reading. Again, if you have any improvements or questions, leave me comments or send me an email.