Good notes posted to Ruby

RSS feed
April 23, 2009
6 thanks

Argument Ordering

Be aware that the order of arguments for this method is the opposite of File.join:

File.expand_path('foo', '/bar')   # => "/bar/foo"
File.join('foo', '/bar')          # => "foo/bar"
April 23, 2009
5 thanks

Handy shorthand for array manipulation

You may write something like this:

>> ['a', 'b', 'c'].collect{|letter| letter.capitalize}
=> ["A", "B", "C"]

But it looks so much nicer this way:

>> ['a', 'b', 'c'].collect(&:capitalize)
=> ["A", "B", "C"]
April 21, 2009
3 thanks

To throw an exception, use Kernel#raise

Other languages use the term throw for raising exceptions, but Ruby has a specific raise call for that.

April 16, 2009
15 thanks

Parameters for Hash#inject

When running inject on a Hash, the hash is first converted to an array before being passed through.

The typical Enumerable#inject approach would be to simply capture the value:

array.inject(...) do |c, v|
end

In the case of a Hash, v is actually a key/value pair Array. That is the key is v.first and the value is v.last, however using the pair this way is awkward and can lead to confusion.

Better to simply expand the parameters in the block definition:

hash.inject(...) do |c, (k, v)|
end

Where c is the traditional carry variable and k/v represent key and value respectively.

March 31, 2009
3 thanks

Sorting Hashes with Symbol Keys

To sort a hash with symbol keys, use Enumerable#sort_by:

h = { :a => 20, :b => 30, :c => 10  }
h.sort                       # => NoMethodError: undefined method `<=>' for :a:Symbol
h.sort_by { |k,v| k.to_s }   # => [[:a, 20], [:b, 30], [:c, 10]]
March 27, 2009
4 thanks

Hour with/without preceding zero

One gotcha is the difference between the hour in 12 hour time with and without a preceding zero. In some fonts they look the same.

With preceding zero (capital I)

Time.now.strftime("%I:%M") # => 05:21

Without preceding zero (lowercase L)

Time.now.strftime("%l:%M") # => 5:21
March 18, 2009
4 thanks

Better autopad numbers

There is a much better way than to use diwadn’s method if you want to pad numbers with zeros. Here’s my recommended way to do it:

"Number: %010d" % 12345 #=> "Number: 0000012345"

It’s very easy. First we begin our placeholder with “%”, then we specify a zero (0) to signify padding with zeros. If we omitted this zero, the number would be padded with spaces instead. When we have done that, just specify the target length of the string. At last a single “d” is placed to signify that we are inserting a number.

Please see String#% and Kernel#sprintf for more information about how to do this.

Here’s another example of how to do it:

12345.to_s.rjust(10, "0") #=> "0000012345"

See String#rjust for more information.

Any of these methods are a lot better than the method outlined below.

March 5, 2009
7 thanks

String#match will match single token only

>> s = “{{person}} ate {{thing}}”

> “{{person}} ate {{thing}}”

>> r = /{{(.*?)}}/

> {{}}

>> s.match®.captures

> [“person”]

Using String#scan pulls out all tokens you were searching for:

>> s.scan®.flatten

> [“person”, “thing”]

March 3, 2009
9 thanks

File class documentation

Most of the File class documentation is located in IO class docs. What you see here is what ‘ftools’ gives you.

February 16, 2009
6 thanks

Usage example

Some examples:

# Remove even numbers
(1..30).reject { |n| n % 2 == 0 }
# => [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]

# Remove years dividable with 4 (this is *not* the full leap years rule)
(1950..2000).reject { |y| y % 4 != 0 }
# => [1952, 1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996, 2000]

# Remove users with karma below arithmetic mean
total = users.inject(0) { |total, user| total += user.karma }
mean = total / users.size
good_users = users.reject { |u| u.karma < mean }
February 12, 2009
4 thanks

Real life use

If you’re wondering what the base64 format is used for, here are some examples:

Just note that the encoded (character) data is about 30% larger than un-encoded (binary) data.

February 12, 2009
4 thanks

Binary files

Another real important flag is b when dealing with binary files. For example to download an mp3 from the internet you need to pass the b flag or the data will be screwed up:

# Downloads a binary file from the internet
require 'open-uri'
url = "http://fubar/song.mp3"
open(url, 'rb') do |mp3|
  File.open("local.mp3", 'wb') do |file|
    file.write(mp3.read)
  end
end

Don’t say you haven’t been warned. :)

February 12, 2009
3 thanks

Other regular-expression modifiers

Likewise you can set Regexp::IGNORECASE directly on the regexp with the literal syntax:

/first/i
# This will match "first", "First" and even "fiRSt"

Even more modifiers

  • o – Perform #{} interpolations only once, the first time the regexp literal is evaluated.

  • x – Ignores whitespace and allows comments in * regular expressions

  • u, e, s, n – Interpret the regexp as Unicode (UTF-8), EUC, SJIS, or ASCII. If none of these modifiers is specified, the regular expression is assumed to use the source encoding.

Literal to the rescue

Like string literals delimited with %Q, Ruby allows you to begin your regular expressions with %r followed by a delimiter of your choice.

This is useful when the pattern you are describing contains a lot of forward slash characters that you don’t want to escape:

%Q(http://)
# This will match "http://"
February 12, 2009
4 thanks

Literal syntax

As you propably know you can create an Array either with the constructor or the literal syntax:

Array.new == []
# => true

But there is also another nice and concise literal syntax for creating Arrays of Strings:

["one", "two", "three"] == %w[one two three]
# => true

You can use any kind of parenthesis you like after the %w, either (), [] or {}. I prefer the square brackets because it looks more like an array.

February 12, 2009
4 thanks

Useful scenario

This can be quite useful, for example when writing a command line script which takes a number of options.

Example

Let’s say you want to make a script that can make the basic CRUD operations. So want to be able to call it like this from the command line:

> my_script create
> my_script delete

The following script allows you to use any abbreviated command as long as it is unambiguous.

# my_script.rb
require 'abbrev'

command = ARGV.first
actions = %w[create read update delete]
mappings = Abbrev::abbrev(actions)
puts mappings[command]

That means you can call it like this:

> my_script cr
> my_script d

And it will print:

create
delete
February 10, 2009
3 thanks

Cheat Sheet

I have written a short introduction and a colorful cheat sheet for Perl Compatible Regular Expressions (PCRE) as used by Ruby’s Regexp class:

http://www.bitcetera.com/en/techblog/2008/04/01/regex-in-a-nutshell/

February 4, 2009
3 thanks

Multiline regexps

A shortcut for multiline regular expressions is

/First line.*Other line/m

(notice the trailing /m)

For example:

text = <<-END
  Hello world!
  This is a test.
END

text.match(/world.*test/m).nil?  #=> false
text.match(/world.*test/).nil?   #=> true
January 15, 2009
4 thanks

Convert String to Class in Rails

ncancelliere gave us a very useful tip below, and I just want to make an addendum for it:

If you are using Rails, there is a CoreExtension for this called String#constantize.

"Foo::BarKeeper".constantize #=> Foo::BarKeeper

You can use it with String#camelize if you have to convert the name too

"foo/bar_keeper".camelize             #=> "Foo::BarKeeper"
"foo/bar_keeper".camelize.constantize #=> Foo::BarKeeper

Don’t forget to rescue NameError in case there was an invalid class name. :-)

January 14, 2009
6 thanks

File open permissions

Usage: File.open path, flags, [permissions]

Flags (bitmasks)

Access:

File::RDONLY

Read-only

File::WRONLY

Write-only

File::RDWR

Read and write

If the file exists:

File::TRUNC

Truncate

File::APPEND

Append

File::EXCL

Fail

If the file doesn’t exist:

File::CREAT

Create

Flags (strings)

r

File::RDONLY

r+

File::RDWR

w

File::WRONLY|File::TRUNC|File::CREAT

a

File::WRONLY|File::APPEND|File::CREAT

Examples

File.open path, File::RDONLY
File.open path, 'w'
File.open path, File::WRONLY|File::TRUNC|File::CREAT
File.open path, File::WRONLY|File::TRUNC|File::CREAT, '0666'
December 6, 2008
5 thanks

Array expansion in blocks

The syntax can be improved as changing the second parameter of the block (values) and using an array of two variables instead, which will be used by Ruby as the key and value of “array”.

array = [['A', 'a'], ['B', 'b'], ['C', 'c']]

hash = array.inject({}) do |memo, (key, value)|
  memo[key] = value
  memo
end

hash
# => {'A' => 'a', 'B' => 'b', 'C' => 'c'}
December 2, 2008
5 thanks

From the official docs

enum.inject(initial) {| memo, obj | block } => obj enum.inject {| memo, obj | block } => obj

Combines the elements of enum by applying the block to an accumulator value (memo) and each element in turn. At each step, memo is set to the value returned by the block. The first form lets you supply an initial value for memo. The second form uses the first element of the collection as a the initial value (and skips that element while iterating).

# Sum some numbers
(5..10).inject {|sum, n| sum + n }              #=> 45
# Multiply some numbers
(5..10).inject(1) {|product, n| product * n }   #=> 151200

# find the longest word
longest = %w{ cat sheep bear }.inject do |memo,word|
   memo.length > word.length ? memo : word
end
longest                                         #=> "sheep"

# find the length of the longest word
longest = %w{ cat sheep bear }.inject(0) do |memo,word|
   memo >= word.length ? memo : word.length
end
longest                                         #=> 5

http://www.ruby-doc.org/core/classes/Enumerable.html

November 19, 2008
9 thanks

Formatting options

Readable strftime

%a - The abbreviated weekday name (“Sun”)

%A - The full weekday name (“Sunday”)

%b - The abbreviated month name (“Jan”)

%B - The full month name (“January”)

%c - The preferred local date and time representation

%d - Day of the month (01..31)

%H - Hour of the day, 24-hour clock (00..23)

%I - Hour of the day, 12-hour clock (01..12)

%j - Day of the year (001..366)

%m - Month of the year (01..12)

%M - Minute of the hour (00..59)

%p - Meridian indicator (“AM” or “PM”)

%S - Second of the minute (00..60)

%U - Week number of the current year, starting with the first Sunday as the first day of the first week (00..53)

%W - Week number of the current year, starting with the first Monday as the first day of the first week (00..53)

%w - Day of the week (Sunday is 0, 0..6)

%x - Preferred representation for the date alone, no time

%X - Preferred representation for the time alone, no date

%y - Year without a century (00..99)

%Y - Year with century

%Z - Time zone name %% - Literal “%’’ character t = Time.now t.strftime(“Printed on %m/%d/%Y”) #=> “Printed on 04/09/2003” t.strftime(“at %I:%M%p”) #=> “at 08:56AM”

November 18, 2008
9 thanks

Pop for last, Shift for first

If you want to pop the first element instead of the last one, use shift .

October 9, 2008
6 thanks

Works with URLs too!

You can use it for web urls as well:

path, file = File.split('/uploads/art/2869-speaking-of-pic.jpg')
p path # => "/uploads/art"
p file # => "2869-speaking-of-pic.jpg"

And you can also use join, to merge url back from the components:

path = File.join(["/uploads/art", "2869-speaking-of-pic.jpg"])
p path # => "/uploads/art/2869-speaking-of-pic.jpg"

Using #join and #split for operations on files and path parts of the URLs is generally better than simply joining/splitting strings by ‘/’ symbol. Mostly because of normalization:

File.split('//tmp///someimage.jpg') # => ["/tmp", "someimage.jpg"]
'//tmp///someimage.jpg'.split('/') # => ["", "", "tmp", "", "", "someimage.jpg"]

Same thing happens with join.

September 12, 2008
24 thanks

Readable strftime

%a - The abbreviated weekday name (“Sun”)

%A - The full weekday name (“Sunday”)

%b - The abbreviated month name (“Jan”)

%B - The full month name (“January”)

%c - The preferred local date and time representation

%d - Day of the month (01..31) %H - Hour of the day, 24-hour clock (00..23)

%I - Hour of the day, 12-hour clock (01..12)

%j - Day of the year (001..366)

%m - Month of the year (01..12) %M - Minute of the hour (00..59)

%p - Meridian indicator (“AM” or “PM”)

%S - Second of the minute (00..60)

%U - Week number of the current year, starting with the first Sunday as the first day of the first week (00..53)

%W - Week number of the current year, starting with the first Monday as the first day of the first week (00..53)

%w - Day of the week (Sunday is 0, 0..6)

%x - Preferred representation for the date alone, no time

%X - Preferred representation for the time alone, no date

%y - Year without a century (00..99) %Y - Year with century

%Z - Time zone name %% - Literal “%” character t = Time.now t.strftime(“Printed on %m/%d/%Y”) #=> “Printed on 04/09/2003” t.strftime(“at %I:%M%p”) #=> “at 08:56AM”

September 5, 2008
3 thanks

Require 'strscan'

To use the StringScanner class,

require 'strscan'
August 27, 2008 - (v1_8_6_287)
3 thanks

Example of raising a custom exception

Create custom exception<br> <pre> class PersonalException < Exception end </pre>

Raise the exception<br> <pre> raise PersonalException.new, “message” </pre>

August 23, 2008
4 thanks

Needs requiring 'enumerator' to work

This method needs that you

require 'enumerator'

for this method to be available.

August 17, 2008
6 thanks

Re: Convert an Array of Arrays to a Hash using inject

If you’re sure you have a two-level array (no other arrays inside the pairs) and exactly two items in each pair, then it’s faster and shorter to use this:

array = [['A', 'a'], ['B', 'b'], ['C', 'c']]
hash = Hash[*array.flatten]

For more than two-level deep arrays this will give the wrong result or even an error (for some inputs).

array = [['A', 'a'], ['B', 'b'], ['C', ['a', 'b', 'c']]]
hash = Hash[*array.flatten]
# => {"A"=>"a", "B"=>"b", "C"=>"a", "b"=>"c"}

But if you’re running Ruby 1.8.7 or greater you can pass an argument to Array#flatten and have it flatten only one level deep:

# on Ruby 1.8.7+
hash = Hash[*array.flatten(1)]
# => {"A"=>"a", "B"=>"b", "C"=>["a", "b", "c"]}