Monday, May 12, 2008

Parsing YAML files in Ruby

I've been messing around with rake lately, hoping to find a build tool that is fairly well-documented, well-supported, and doesn't make me dig through a pile of angle brackets.  At work, we store all our third-party assemblies on a centralized share, so one of the first tasks in any build file is to copy those assemblies locally before you can compile.  It makes sense to me to list the reference assemblies in a human-readable data file and read them in at runtime with a small class written for the occasion.  But don't forget: I'm on a mission to vanquish XML from my build process at all costs.  A fool's errand perhaps, but my errand nonetheless.

Enter YAML.  Simple and elegant, YAML dispenses with lots of overhead and requires just enough structure to make it perfect for config files.  Here's an example:

    ---
    shared paths:
      lib share   : \\MySharedServer\lib
      vendor      : \vendor

    local paths:
      references  : \references

    custom assemblies:
      - location: \Dev\Components\Business\Core\Trunk\Latest\Debug
        assemblies:
          - name: MyNamespace.Core
            files:
              - output    : MyNamespace.Core.dll
              - debug     : MyNamespace.Core.pdb
              - document  : MyNamespace.Core.xml

      - location: \Dev\Components\Framework\Trunk\Debug
        assemblies:
          - name: MyNamespace.Framework.Core
            files:
              - output    : MyNamespace.Framework.Core.dll
              - debug     : MyNamespace.Framework.Core.pdb
              - document  : MyNamespace.Framework.Core.xml

    vendor assemblies:
      - location: \DotNet Commons\Logging\2.0
        assemblies:
          - name: Dotnet.Commons.Logging
            files:
              - output    : Dotnet.Commons.Logging.dll

    testing assemblies:
      - location: \Nunit\2.4.3
        assemblies:
          - name: NUnit.Framework
            files:
              - output    : nunit.framework.dll

      - location: \Rhino.Mocks\3.3.0.906
        assemblies:
          - name: Rhino.Mocks
            files:
              - output    : Rhino.Mocks.dll
              - document  : Rhino.Mocks.xml
    ...

What I've essentially got here is a hierarchy of nested hashes and arrays.  So in Ruby all I have to do is this:

    require 'yaml'
    refs = open('references.yml') {|f| YAML.load(f) }

and I've got direct access in code to all my data.  What threw me at first was that I expected an easy XPath-like way to say "give me a list of all the output files in the custom assemblies node".  The loaded data is just plain hashes and arrays, so instead you have to write some code to walk them yourself, but it turns out that's not hard.  Here's the Ruby code I wrote to answer my query:

    require 'yaml'

    refs = open('references.yml') { |f| YAML.load(f) }

    # Walk location -> assembly -> file and print every "output" file name.
    refs['custom assemblies'].each do |location|
      location['assemblies'].each do |assembly|
        assembly['files'].each do |file|
          puts "        #{file['output']}" if file['output']
        end
      end
    end
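
For comparison, here is a rough sketch of the same query written as a chain of flat_map calls instead of nested loops.  The files_of_kind helper and its parameter names are hypothetical, invented for this illustration, and the YAML is a trimmed copy of the config inlined as a string so the snippet runs without references.yml on disk.

```ruby
require 'yaml'

# Trimmed-down stand-in for references.yml, inlined so the sketch runs on its own.
sample = <<~'YAML'
  custom assemblies:
    - location: \Dev\Components\Business\Core\Trunk\Latest\Debug
      assemblies:
        - name: MyNamespace.Core
          files:
            - output    : MyNamespace.Core.dll
            - debug     : MyNamespace.Core.pdb
    - location: \Dev\Components\Framework\Trunk\Debug
      assemblies:
        - name: MyNamespace.Framework.Core
          files:
            - output    : MyNamespace.Framework.Core.dll
            - document  : MyNamespace.Framework.Core.xml
YAML

# Hypothetical helper: collect every value stored under `kind` ("output",
# "debug", ...) within one top-level section of the parsed data.
def files_of_kind(refs, section, kind)
  refs.fetch(section, [])                                 # missing section -> empty list
      .flat_map { |location| location['assemblies'] }     # all assemblies, flattened
      .flat_map { |assembly| assembly['files'] }          # all one-key file hashes
      .map { |file| file[kind] }                          # nil when the key is absent
      .compact                                            # drop those nils
end

refs = YAML.load(sample)
puts files_of_kind(refs, 'custom assemblies', 'output')
```

With the sample above this prints the two .dll names, one per line.  Passing 'debug' or 'document' as the last argument pulls the other file kinds without touching the traversal.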

I'm sure there are much more elegant ways to write those nested loops, and as I learn idiomatic Ruby I'll write better Ruby code.  But for now, here's why this works:  my little config file is basically just a big, weird hash with five keys.  The "shared paths" key holds a hash (a mapping, in YAML terms) with two entries: lib share and vendor.  The "local paths" key holds a hash with one entry.  "custom assemblies" is different: it holds a list (a sequence, in YAML terms) with two items, and each item is a hash with two keys ("location" and "assemblies").  If we keep digging down, the "files" key holds an array of single-entry hashes - the sample above digs down to this level and prints the file name if the hash key is "output".
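
One way to convince yourself of that shape is to poke at the loaded object directly.  The sketch below inlines an abbreviated copy of the config as a string (an assumption made so it runs without the file) and prints the Ruby class or keys found at each level.  It also uses Hash#dig, which needs Ruby 2.3 or later and so was not available when this post was written.

```ruby
require 'yaml'

# Abbreviated copy of the config, inlined so the sketch is self-contained.
sample = <<~'YAML'
  shared paths:
    lib share   : \\MySharedServer\lib
    vendor      : \vendor
  custom assemblies:
    - location: \Dev\Components\Business\Core\Trunk\Latest\Debug
      assemblies:
        - name: MyNamespace.Core
          files:
            - output    : MyNamespace.Core.dll
            - debug     : MyNamespace.Core.pdb
YAML

refs = YAML.load(sample)

p refs.class                          # the top level loads as a Hash
p refs['shared paths'].keys           # a YAML mapping becomes a Hash with String keys
p refs['custom assemblies'].class     # a YAML sequence becomes an Array
p refs['custom assemblies'][0].keys   # each list item is itself a Hash

# dig follows a known path of keys and indexes without chained [] calls.
files = refs.dig('custom assemblies', 0, 'assemblies', 0, 'files')
p files
```

Note that dig only helps when you already know the exact path; it is not a substitute for an XPath-style search across every branch.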
