When regexp is not the best solution

Regular expression are really useful. Unfortunately, they are not always the best way of doing things. Particularly when transformations you want to make are easy.

I wanted to know how to get file extension from filename the fastest way possible. There is 3 natural way of doing this:

# regexp str.match(/[^.]*/); ext=&



File module


At first sight I believed that the regexp should be faster than the split because it could be many . in a filename. But in reality, most of time there is only one dot and I realized the split will be faster. But not the fastest way. There is a function dedicated to this work in the File module.

Here is the Benchmark ruby code:

#!/usr/bin/env ruby require ‘benchmark’ n=80000 tab=[ ‘/accounts/user.json’, ‘/accounts/user.xml’, ‘/user/titi/blog/toto.json’, ‘/user/titi/blog/toto.xml’ ]

puts “Get extname” Benchmark.bm do |x| x.report(“regexp:”) { n.times do str=tab[rand(4)]; str.match(/[^.]*/); ext=&; end } x.report(" split:“) { n.times do str=tab[rand(4)]; ext=str.split(‘.’)[-1] ; end } x.report(” File:") { n.times do str=tab[rand(4)]; ext=File.extname(str); end } end

And here is the result

Get extname
            user     system      total        real
regexp:  2.550000   0.020000   2.570000 (  2.693407)
 split:  1.080000   0.050000   1.130000 (  1.190408)
  File:  0.640000   0.030000   0.670000 (  0.717748)

Conclusion of this benchmark, dedicated function are better than your way of doing stuff (most of time).

file path without the extension.

#!/usr/bin/env ruby require ‘benchmark’ n=80000 tab=[ ‘/accounts/user.json’, ‘/accounts/user.xml’, ‘/user/titi/blog/toto.json’, ‘/user/titi/blog/toto.xml’ ]

puts “remove extension” Benchmark.bm do |x| x.report(" File:“) { n.times do str=tab[rand(4)]; path=File.expand_path(str,File.basename(str,File.extname(str))); end } x.report(”chomp:") { n.times do str=tab[rand(4)]; ext=File.extname(str); path=str.chomp(ext); end } end

and here is the result:

remove extension
          user     system      total        real
 File:  0.970000   0.060000   1.030000 (  1.081398)
chomp:  0.820000   0.040000   0.860000 (  0.947432)

Conclusion of the second benchmark. One simple function is better than three dedicated functions. No surprise, but it is good to know.