good coders code, great reuse

An API that isn't comprehensible isn't usable.

James Gosling

I am now on Twitter! Meet me on Twitter here (my nick is pkrumins.)
Or on Google Buzz and Facebook.

Vim Plugins You Should Know About, Part VII: ragtag.vim (formerly allml.vim)
Programming 03 Mar 2010 09:20 am

This is the seventh post in the article series “Vim Plugins You Should Know About”. This time I am going to introduce you to a plugin called “ragtag.vim”. A month ago it was still known as “allml.vim”, but it has since been renamed to ragtag.vim.

The best parts of RagTag are its mappings for editing HTML tags. It has a mapping for quickly closing open HTML tags, a mapping for quickly turning the typed word into a pair of open/close HTML tags, and several mappings for inserting an HTML doctype, linking to CSS stylesheets, and loading JavaScript with <script src="...">...</script>. It also includes mappings for wrapping the typed text in a pair of <?php ... ?> tags for PHP, <% ... %> for ASP or eRuby, and {% ... %} for Django.

RagTag is written by Tim Pope. He’s the master of Vim plugin programming. I have already written about two of his plugins, surround.vim and repeat.vim, and more articles about his plugins are coming!

Previous articles in the series.

Here are examples of using the RagTag plugin.

Quickly closing an open HTML tag.

Suppose you have typed <div> and you want to close it without typing the whole closing tag </div> yourself.

The quick way to do it with RagTag is to press CTRL+X /. This mapping automatically closes the last open HTML tag.

Extra tip: If you didn’t have this plugin, you could quickly close the tag by typing </ and pressing CTRL+X CTRL+O. The default Vim mapping CTRL+X CTRL+O invokes omni completion, which guesses the item that may come after the cursor. In the case of an open HTML tag, it’s the closing tag.

Creating a pair of open/close HTML tags from a word.

Suppose you want to quickly create a pair of <div></div> and place the cursor between the two tags.

The quick way to do it with RagTag is to type div and press CTRL+X SPACE. This mapping takes the typed word and creates a pair of HTML tags, an opening tag and a closing tag, on the same line.

However, if you wish to create an open/close tag pair separated by a newline, press CTRL+X ENTER.

For example, if you have just typed div and then press CTRL+X ENTER, it will produce the following output:

<div>
|
</div>

| indicates the position of cursor.

Insert HTML doctype.

If you type CTRL+X !, RagTag displays a list of HTML doctypes to choose from. It defaults to HTML 4.01 Strict.

This mapping is not that useful, given that I already introduced snipmate.vim plugin for creating snippets. Using snipmate.vim you can create a snippet “h” that would insert the whole HTML structure, including doctype, html, body, title, meta tags, etc.

Link to a CSS stylesheet.

Typing CTRL+X @ inserts the snippet for linking to a CSS stylesheet.

<link rel="stylesheet" type="text/css" href="/stylesheets/|.css">

This is again not that useful, given that we have snipmate.vim.

The mapping is easy to remember because @ is used for importing in CSS.

Insert meta content-type tag.

Typing CTRL+X # inserts the HTML meta tag for document’s content type and encoding.

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

The charset depends on document’s charset. If it’s utf-8, the mapping will set the charset to utf-8 in the meta tag.

Load JavaScript document.

Typing CTRL+X $ links to a JavaScript file.

<script type="text/javascript" src="/javascripts/|.js">

The mapping is easy to remember because $ is a valid char in identifiers in many languages.

Wrap the typed text in PHP, Django, eRuby template tags.

There are several different mappings for wrapping text in template tags. It’s best to summarize them in the following table. The table assumes you had just typed “foo” and you are editing an eRuby document:

Mapping    Result
---------  -----------
CTRL+X =   foo<%= | %>
CTRL+X +   <%= foo| %>
CTRL+X -   foo<% | %>
CTRL+X _   <% foo| %>
CTRL+X '   foo<%# | %>
CTRL+X "   <%# foo| %>

What this table shows us is, for example, that if you have typed “foo” and press CTRL+X _, the plugin will wrap “foo” in <% %> tags and place the cursor after “foo”.

Summary of all the RagTag mappings.

CTRL+X /       Close the last open HTML tag
CTRL+X SPACE   Create open/close HTML tags from the typed word
CTRL+X CR      The same as CTRL+X SPACE but puts a newline in between
CTRL+X !       Insert HTML doctype
CTRL+X @       Insert CSS stylesheet
CTRL+X #       Insert meta content-type tag
CTRL+X $       Load JavaScript document

For the following mappings, suppose that
you have typed "foo".

Mapping        Result
---------      -----------
CTRL+X =       foo<%= | %>
CTRL+X +       <%= foo| %>
CTRL+X -       foo<% | %>
CTRL+X _       <% foo| %>
CTRL+X '       foo<%# | %>
CTRL+X "       <%# foo| %>

How to install ragtag.vim?

To get the latest version:

  1. Download ragtag.zip.
  2. Extract ragtag.zip to ~/.vim (on Unix/Linux) or ~\vimfiles (on Windows).
  3. Run :helptags ~/.vim/doc (on Unix/Linux) or :helptags ~\vimfiles\doc (on Windows) to rebuild the tags file (so that you can read the help with :help ragtag).
  4. Restart Vim or source ragtag.vim by typing :so ~/.vim/plugin/ragtag.vim (on Unix/Linux) or :so ~\vimfiles\plugin\ragtag.vim (on Windows).

Have Fun!

Have fun closing those HTML tags and until next time!


Famous Perl One-Liners Explained, Part V: Text conversion and substitution
Programming 03 Feb 2010 12:10 pm

This is the fifth part of a nine-part article on famous Perl one-liners. In this part I will create various one-liners for text conversion and substitution. See part one for an introduction to the series.

Famous Perl one-liners is my attempt to create “perl1line.txt” that is similar to “awk1line.txt” and “sed1line.txt” that have been so popular among Awk and Sed programmers.

The article on famous Perl one-liners will consist of nine parts:

Everyone who’s subscribed to my blog will get a free copy of Perl One-Liners e-book when I release it as part 9 of this article series!

Alright then, here are today’s one-liners:

Text conversion and substitution

62. ROT13 a string.

'y/A-Za-z/N-ZA-Mn-za-m/'

This one-liner uses the y operator (also known as tr operator) to do ROT13. Operators y and tr do string transliteration. Given y/SEARCH/REPLACE/, the operator transliterates all occurrences of the characters found in SEARCH list with the corresponding (position-wise) characters in REPLACE list.

In this one-liner A-Za-z creates the following list of characters:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

And N-ZA-Mn-za-m creates this list:

NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm

If you look closely you’ll notice that the second list is actually the first list offset by 13 characters. Now the y operator translates each character in the first list to a character in the second list, thus performing the ROT13 operation.

If you wish to ROT13 the whole file then do this:

perl -lpe 'y/A-Za-z/N-ZA-Mn-za-m/' file

The -p argument puts each line of the file in the $_ variable, the y operator does ROT13 on it, and -p then prints $_ out. The -l argument strips the newline from each input line and appends it again on output.

Note: remember that applying ROT13 twice gives back the original string, i.e., ROT13(ROT13(string)) == string.

63. Base64 encode a string.

perl -MMIME::Base64 -e 'print encode_base64("string")'

This one-liner uses the MIME::Base64 module that is in the core (no need to install it, it comes with Perl). This module exports the encode_base64 function that takes a string and returns base64 encoded version of it.

To base64 encode the whole file do the following:

perl -MMIME::Base64 -0777 -ne 'print encode_base64($_)' file

Here the -0777 argument together with -n causes Perl to slurp the whole file into the $_ variable. Then the file gets base64 encoded and printed out, just like the string example above.

If we didn’t slurp the file and encoded it line-by-line we’d get a mess.

64. Base64 decode a string.

perl -MMIME::Base64 -le 'print decode_base64("base64string")'

The MIME::Base64 module also exports decode_base64 function that takes a base64-encoded string and decodes it.

The whole file can be similarly decoded by:

perl -MMIME::Base64 -ne 'print decode_base64($_)' file

There is no need to slurp the whole file into $_ because encode_base64 breaks its output into lines of at most 76 characters, and each line of a file encoded this way decodes cleanly on its own.

65. URL-escape a string.

perl -MURI::Escape -le 'print uri_escape($string)'

You’ll need to install the URI::Escape module as it doesn’t come with Perl. The module exports two functions - uri_escape and uri_unescape. The first one does URL-escaping (sometimes also referred to as URL encoding), and the other does URL-unescaping (URL decoding).

66. URL-unescape a string.

perl -MURI::Escape -le 'print uri_unescape($string)'

This one-liner uses the uri_unescape function from URI::Escape module to do URL-unescaping.

67. HTML-encode a string.

perl -MHTML::Entities -le 'print encode_entities($string)'

This one-liner uses the encode_entities function from HTML::Entities module. This function encodes HTML entities. For example, < and > get turned into &lt; and &gt;.

68. HTML-decode a string.

perl -MHTML::Entities -le 'print decode_entities($string)'

This one-liner uses the decode_entities function from HTML::Entities module.

69. Convert all text to uppercase.

perl -nle 'print uc'

This one-liner uses the uc function, which by default operates on the $_ variable and returns an uppercase version of it.

Another way to do the same is to use -p command line option that enables automatic printing of $_ variable and modify it in-place:

perl -ple '$_=uc'

The same can also be achieved by applying the \U escape sequence to string interpolation:

perl -nle 'print "\U$_"'

It causes anything after it (or until the first occurrence of \E) to be upper-cased.

70. Convert all text to lowercase.

perl -nle 'print lc'

This one-liner is very similar to the previous. Here the lc function is used that converts the contents of $_ to lowercase.

Or, using escape sequence \L and string interpolation:

perl -nle 'print "\L$_"'

Here \L causes everything after it (until the first occurrence of \E) to be lower-cased.

71. Uppercase only the first letter of each line.

perl -nle 'print ucfirst lc'

The one-liner first applies the lc function to the input that makes it lower case and then uses the ucfirst function that upper-cases only the first character.

It can also be done via escape codes and string interpolation:

perl -nle 'print "\u\L$_"'

First the \L lower-cases the whole line, then \u upper-cases the first character.

72. Invert the letter case.

perl -ple 'y/A-Za-z/a-zA-Z/'

This one-liner transliterates capital letters A-Z to lowercase letters a-z, and lowercase letters to uppercase letters, thus switching the case.

73. Camel case each line.

perl -ple 's/(\w+)/\u$1/g'

This is a lousy Camel Casing one-liner. It takes each word and upper-cases the first letter of it. It fails on possessive forms like “friend’s car”. It turns them into “Friend’S Car”.

An improvement is:

s/(?<!['])(\w+)/\u$1/g

It checks that the character before the word is not a single quote '. But I am sure it still fails on some more exotic examples.

74. Strip leading whitespace (spaces, tabs) from the beginning of each line.

perl -ple 's/^[ \t]+//'

This one-liner deletes all whitespace from the beginning of each line. It uses the substitution operator s. Given s/REGEX/REPLACE/ it replaces the matched REGEX by the REPLACE string. In this case the REGEX is ^[ \t]+, which means “match one or more space or tab at the beginning of the string” and REPLACE is nothing, meaning, replace the matched part with empty string.

The regex [ \t]+ can actually be replaced by \s+, which matches any whitespace (including tabs and spaces):

perl -ple 's/^\s+//'

75. Strip trailing whitespace (spaces, tabs) from the end of each line.

perl -ple 's/[ \t]+$//'

This one-liner deletes all whitespace from the end of each line.

Here the REGEX of the s operator says “match one or more space or tab at the end of the string.” The REPLACE part is empty again, which means to erase the matched whitespace.

76. Strip whitespace from the beginning and end of each line.

perl -ple 's/^[ \t]+|[ \t]+$//g'

This one-liner combines the previous two. Notice that it specifies the global /g flag to the s operator. It’s necessary because we want it to delete whitespace at the beginning AND end of the string. If we didn’t specify it, it would only delete whitespace at the beginning (assuming it exists) and not at the end.

77. Convert UNIX newlines to DOS/Windows newlines.

perl -pe 's|\n|\r\n|'

This one-liner substitutes the Unix newline \n LF with Windows newline \r\n CRLF on each line. Remember that the s operator can use anything for delimiters. In this one-liner it uses vertical pipes to delimit REGEX from REPLACE to improve readability.

78. Convert DOS/Windows newlines to UNIX newlines.

perl -pe 's|\r\n|\n|'

This one-liner does the opposite of the previous one. It takes Windows newlines CRLF and converts them to Unix newlines LF.

79. Convert UNIX newlines to Mac newlines.

perl -pe 's|\n|\r|'

Apple Macintoshes used to use \r CR as newlines. This one-liner converts UNIX’s \n to Mac’s \r.

80. Substitute (find and replace) “foo” with “bar” on each line.

perl -pe 's/foo/bar/'

This one-liner uses the s/REGEX/REPLACE/ command to substitute “foo” with “bar” on each line.

To replace all “foos” with “bars”, add the global /g flag:

perl -pe 's/foo/bar/g'

81. Substitute (find and replace) “foo” with “bar” on lines that match “baz”.

perl -pe '/baz/ && s/foo/bar/'

This one-liner is equivalent to:

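while (defined($line = <>)) {
  if ($line =~ /baz/) {
    $line =~ s/foo/bar/;
  }
  print $line;   # -p prints every line, changed or not
}

It puts each line in variable $line, then checks if the line matches “baz”, and if it does, it replaces “foo” with “bar” in it. Every line is then printed, whether or not it was changed.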

Got tired.

I got tired of writing one-liners at this moment. I’ll update this article the same way I did the 3rd part - I’ll add a few new one-liners each week until the article is finished.

Have Fun!

Have fun with these one-liners for now. The next part is going to be about selective printing and deleting of certain lines.

Can you think of other text conversion and substitution procedures that I did not include here?


Vim Plugins You Should Know About, Part VI: nerd_tree.vim
Programming 18 Jan 2010 10:15 am

This is the sixth post in the article series “Vim Plugins You Should Know About”. This time I am going to introduce you to a vim plugin called “nerd_tree.vim”. It’s so useful that I can’t imagine working without it in vim.

Nerd Tree is a nifty plugin that allows you to explore the file system and open files and directories directly from vim. It opens the file system tree in a new vim window and you may use keyboard shortcuts and mouse to open files in new tabs, in new horizontal and vertical splits, quickly navigate between directories and create bookmarks for your most important projects.

This plugin was written by Marty Grenfell (also known as scrooloose).

Previous articles in the series:

Ps. Please help me reach 10,000 RSS subscribers. I am almost there. If you enjoy my posts and have not yet subscribed, subscribe here!

How to use nerd_tree.vim?

Nerd Tree plugin can be activated by the :NERDTree vim command. It will open in vim as a new vertical split on the left:

[Screenshot: the Nerd Tree plugin in action.]

Here are the basics of how to use the plugin:

  • Use the natural vim navigation keys hjkl to navigate the files.
  • Press o to open the file in a new buffer or open/close directory.
  • Press t to open the file in a new tab.
  • Press i to open the file in a new horizontal split.
  • Press s to open the file in a new vertical split.
  • Press p to go to parent directory.
  • Press r to refresh the current directory.

All other keyboard shortcuts can be found by pressing ?. It will open a special help screen with the shortcut listings. Press ? again to get back to file tree.

To close the plugin execute the :NERDTreeClose command.

Typing :NERDTree and :NERDTreeClose all the time is really inconvenient. Therefore I have mapped the toggle command :NERDTreeToggle to the F2 key. This way I can quickly open and close Nerd Tree whenever I wish. You can also map it to F2 by putting map <F2> :NERDTreeToggle<CR> in your .vimrc file.

How to install nerd_tree.vim?

To get the latest version:

  1. Download NERD_tree.zip.
  2. Extract NERD_tree.zip to ~/.vim (on Unix/Linux) or ~\vimfiles (on Windows).
  3. Run :helptags ~/.vim/doc (on Unix/Linux) or :helptags ~\vimfiles\doc (on Windows) to rebuild the tags file (so that you can read :help NERD_tree).
  4. Restart Vim.

Have Fun!

Have fun exploring your files with this awesome plugin and until next time!


Famous Perl One-Liners Explained, Part IV: String and Array Creation
Programming 07 Jan 2010 08:00 am

This is the fourth part of a nine-part article on famous Perl one-liners. In this part I will create various one-liners for string and array creation. See part one for an introduction to the series.

Famous Perl one-liners is my attempt to create “perl1line.txt” that is similar to “awk1line.txt” and “sed1line.txt” that have been so popular among Awk and Sed programmers.

The article on famous Perl one-liners will consist of nine parts:

  • Part I: File spacing.
  • Part II: Line numbering.
  • Part III: Calculations.
  • Part IV: String creation and array creation (this part).
  • Part V: Text conversion and substitution.
  • Part VI: Selective printing and deleting of certain lines.
  • Part VII: Handy regular expressions.
  • Part VIII: Release of perl1line.txt.
  • Part IX: Release of Perl One-Liners e-book.

I decided that there will be two new parts in this series. The most powerful feature in Perl is its regular expressions, therefore I will write a part on “Handy Perl regular expressions.” I also decided to publish an e-book after I am done with the series, so that will be the last part of this series. Everyone who’s subscribed to my blog will get a free copy! Subscribe to my blog now!

I also updated the previous part on calculations with 14 new one-liners on finding values of constants pi and e, doing date calculations, finding factorial, greatest common divisor, least common multiple, generating random numbers, generating permutations, finding power sets and doing some IP address conversions.

Here are today’s one-liners:

String Creation and Array Creation

49. Generate and print the alphabet.

perl -le 'print a..z'

This one-liner prints all the letters from a to z as abcdefghijklmnopqrstuvwxyz. The letters are generated by the range operator (..). When used in list context (which is forced here by print) on strings, the range operator uses the magical auto-increment algorithm that advances the string to the next character. So in this one-liner the auto-increment algorithm on the range a..z produces all the letters from a to z.

I really golfed this one-liner. If you used strict, it would not work because of the barewords a and z. A semantically more correct version is this:

perl -le 'print ("a".."z")'

Remember that the range operator .. produces a list of values. If you wish, you may print them comma separated by setting the $, special variable:

perl -le '$, = ","; print ("a".."z")'

There are many more special variables. Take a look at my special variable cheat sheet for a complete listing.

Syntactically more appealing is to use join to separate the list with a comma:

perl -le 'print join ",", ("a".."z")'

Here the list a..z gets joined by a comma before printing.

50. Generate and print all the strings from “a” to “zz”.

perl -le 'print ("a".."zz")'

Here the range operator .. is used again. This time it does not stop at “z” as in the previous one-liner, but advances z by one-character producing “aa”, then it keeps going, producing “ab”, “ac”, …, until it hits “az”. At this point it advances the string to “ba”, continues with “bb”, “bc”, …, until it reaches “zz”.

Similarly, you may generate all strings from “aa” to “zz” by:

perl -le 'print "aa".."zz"'

Here it goes like “aa”, “ab”, …, “az”, “ba”, “bb”, …, “bz”, “ca”, … “zz”.

51. Create a hex lookup table.

@hex = (0..9, "a".."f")

Here the array @hex gets filled with values 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and letters a, b, c, d, e, f.

You may use this array to convert a number (in variable $num) from decimal to hex using base conversion formula:

perl -le '$num = 255; @hex = (0..9, "a".."f"); while ($num) { $s = $hex[($num%16)&15].$s; $num = int $num/16 } print $s'

Surely, a much easier way to convert a number to hex is to use the printf function (or sprintf function) with the %x format specifier. (The example above just illustrates a use of the hex lookup table that we created by using the range operator.)

perl -le '$hex = sprintf("%x", 255); print $hex'

(See my Perl printf and sprintf format cheat sheet for all the format specifiers.)

To convert the number back from hex to dec use the hex function:

perl -le '$num = "ff"; print hex $num'

The hex function takes a hex string (beginning with or without “0x”) and converts it to decimal.

52. Generate a random 8 character password.

perl -le 'print map { ("a".."z")[rand 26] } 1..8'

Here the map function executes ("a".."z")[rand 26] code 8 times (because it iterates over the dummy range 1..8). In each iteration the code chooses a random letter from the alphabet. When map is done iterating, it returns the generated list of characters and print function prints it out by concatenating all the characters together.

If you also wish to include numbers in the password, add 0..9 to the list of characters to choose from and change 26 to 36 as there are 36 different characters to choose from:

perl -le 'print map { ("a".."z", 0..9)[rand 36] } 1..8'

If you need a longer password, change 1..8 to 1..20 to generate a 20 character long password.

53. Create a string of specific length.

perl -le 'print "a"x50'

Operator x is the repetition operator. This one-liner creates a string of 50 letters “a” and prints it.

If the repetition operator is used in list context, it creates a list (instead of scalar) with the given elements repeated:

perl -le '@list = (1,2)x20; print "@list"'

This one liner creates a list of twenty repetitions of (1, 2) (it looks like (1, 2, 1, 2, 1, 2, …)).

54. Create an array from a string.

@months = split ' ', "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"

Here the @months gets filled with values from the string containing month names. As each month name is separated by a space, the split function splits them and puts them in @months. This way $months[0] contains “Jan”, $months[1] contains “Feb”, …, and $months[11] contains “Dec”.

Another way to do the same is by using qw// operator:

@months = qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/

The qw// operator takes a space separated string and creates an array with each word being an array element.

55. Create a string from an array.

@stuff = ("hello", 0..9, "world"); $string = join '-', @stuff

Here the values in array @stuff get turned into a string $string that has them separated by a hyphen. Turning an array into a string is done by the join function, which takes a separator and a list, and concatenates the items in the list into a single string, separated by the separator.

56. Find the numeric values for characters in the string.

perl -le 'print join ", ", map { ord } split //, "hello world"'

This one-liner takes the string “hello world”, splits it into a list of characters by split //, "hello world", then it maps the ord function onto each of the characters, which returns the numeric, native 8-bit encoding (like ASCII or EBCDIC) of the character. Finally all the numeric values get joined together by a comma and get printed out.

Another way to do the same is to use the unpack function and specify C* as the unpacking template (C means unsigned character and * means as many characters as there are):

perl -le 'print join ", ", unpack("C*", "hello world")'

57. Convert a list of numeric ASCII values into a string.

perl -le '@ascii = (99, 111, 100, 105, 110, 103); print pack("C*", @ascii)'

Just as we unpacked a string into a list of values with the C* template in the one-liner above, we can pack them back into a string.

Another way to do the same is to use the chr function that takes the code point value and returns the corresponding character:

perl -le '@ascii = (99, 111, 100, 105, 110, 103); print map { chr } @ascii'

Similar to one-liner #56 above, the chr function gets mapped onto each value in @ascii, producing the characters.

58. Generate an array with odd numbers from 1 to 100.

perl -le '@odd = grep {$_ % 2 == 1} 1..100; print "@odd"'

This one-liner generates an array of odd numbers from 1 to 99 (as 1, 3, 5, 7, 9, 11, …, 99). It uses the grep function that evaluates the given code $_ % 2 == 1 for each element in the given list 1..100 and returns only the elements for which the code evaluates to true. In this case the code tests if the remainder of dividing the number by 2 is 1. If it is, the number is odd and it is put in the @odd array.

Another way to write is by remembering that odd numbers have the low-bit set and testing this fact:

perl -le '@odd = grep { $_ & 1 } 1..100; print "@odd"'

Expression $_ & 1 isolates the low-bit, and grep selects only the numbers with low-bit set (odd numbers).

See my explanation of bit-hacks for full explanation and other related bit-hacks.

59. Generate an array with even numbers from 1 to 100.

perl -le '@even = grep {$_ % 2 == 0} 1..100; print "@even"'

This is almost the same as the previous one-liner, except the condition grep tests for is “is the number even (is the remainder of dividing by 2 zero)?”

60. Find the length of the string.

perl -le 'print length "one-liners are great"'

Just for completeness, the length subroutine finds the length of the string.

61. Find the number of elements in an array.

perl -le '@array = ("a".."z"); print scalar @array'

Evaluating an array in a scalar context returns the number of elements in it.

Another way to do the same is by adding one to the last index of the array:

perl -le '@array = ("a".."z"); print $#array + 1'

Here $#array returns the last index in array @array. Since it’s a number one less than the number of elements, we add 1 to the result to find the total number of elements in the array.

Have Fun!

Have fun with these one-liners for now. The next part is going to be about text conversion and substitution.

Can you think of other string creating and array creation one-liners that I didn’t include here?


Recursive Regular Expressions
Programming 14 Dec 2009 08:05 am
Yo dawg, I heard you liked regular expressions, so I put a regex in your regex so you can match while you match!

The regular expressions we use in our daily lives are actually not that “regular.” Most languages support some kind of extended regular expressions that are computationally more powerful than the “regular” regular expressions as defined by formal language theory.

For instance, the often-used capture buffers add auxiliary storage to regular expressions, which allows them to match an arbitrary pattern repeatedly. Look-ahead assertions allow the regular expression engine to peek ahead before making a decision. These extensions make regular expressions powerful enough to describe some context-free grammars.

The Perl programming language has an especially rich regex engine. One of the engine’s features is lazy regular subexpressions. They are written as (??{ code }), where “code” is arbitrary Perl code that gets executed at the moment the subexpression may match.

This allows us to construct something really interesting - we can define a regular expression that has itself in the “code” part. The result is a recursive regular expression!

One of the classical problems that a regular expression can’t match is the language 0^n1^n, i.e., a string with a number of zeroes followed by an equal number of ones. Surprisingly, using the lazy regular subexpressions this problem becomes tractable!

Here is a Perl regular expression that matches 0^n1^n:

$regex = qr/0(??{$regex})?1/;

This regular expression matches a 0 followed by itself zero or one time, followed by a one. If the itself part doesn’t match, then the string this regular expression matches is 01. If the itself part matches, the string this regular expression matches is 00($regex)?11, which is 0011 if $regex doesn’t match or it’s 000($regex)?111 if it matches, …, etc.

Here is a Perl program that matches 0^50000 1^50000 (50000 zeroes followed by 50000 ones):

#!/usr/bin/perl

$str = "0"x50000 . "1"x50000;
$regex = qr/0(??{$regex})?1/;

if ($str =~ /^$regex$/) {
  print "yes, it matches"
}
else {
  print "no, it doesn't match"
}

Now let’s look at the Yo Dawg regular expression in the picture above. Can you guess what it does? It matches a fully parenthesized expression such as (foo(bar())baz) or balanced parentheses ((()()())()).

$regex = qr/
  \(                 # (1) match an open paren (
    (                # followed by
      [^()]+         #   (3) one or more non-paren character
    |                # OR
      (??{$regex})   #   (5) the regex itself
    )*               # (6) repeated zero or more times
  \)                 # (7) followed by a close paren )
/x;

Here is how to think about this regular expression. For an expression to be fully parenthesized, it has to start with an open paren, so we match it (point (1) in the regex). It also has to end with a close paren, so we match a close paren at the end (point (7)). Now we have to think about what can be in between the parens. Well, we can either have some text that is neither an open paren nor a close paren (point (3)) OR we can have another fully parenthesized expression (point (5)). And all this may be repeated either zero times (point (6)) to match the smallest fully parenthesized expression () or more times to match a more complex expression.

Without the /x flag (which allows whitespace and comments inside the regex), it can be written more compactly:

$regex = qr/\(([^()]+|(??{$regex}))*\)/;

But please don’t use these regular expressions in production as they are too cryptic. Use Text::Balanced or Regexp::Common Perl modules.

And finally, in Perl 5.10 you can use recursive capture buffers instead of lazy code subexpressions to achieve the same result.

Here is a regular expression that matches 0^n1^n and uses the recursive capture buffer syntax (?N):

my $rx = qr/(0(?1)?1)/;

The (?1)? says “match the first group zero or one time,” where the first group is the whole regular expression.

You can try to rewrite the regular expression that matches balanced parens as an exercise.

Have fun!
