In Hpricot you can call xpath on a node to get the XPath that will retrieve that node from the document. In Nokogiri the equivalent is path.
I ran into this while trying to figure out the XPath to a node in an HTML document. My normal routine is to load the document in IRB and poke around until I find the things I need.
In Nokogiri, &nbsp; entities are converted to whitespace, but the result is a non-breaking space (U+00A0) rather than a normal space, so the standard String#strip and friends don't remove it. Tenderlove on IRC gave me the following snippet to remove them:
Nokogiri::HTML.parse("&nbsp;y").at("p").inner_text.gsub(/\302\240/, ' ').strip == 'y'
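The effect is easy to reproduce in plain Ruby, without Nokogiri. The sketch below assumes Ruby 1.9 or later with UTF-8 strings, where the non-breaking space is the single character U+00A0 (the bytes \302\240 in UTF-8). The variable names are illustrative only.

```ruby
# A string that starts with a non-breaking space (U+00A0), as produced from &nbsp;.
text = "\u00A0y"

# The built-in strip leaves the non-breaking space in place.
stripped = text.strip
puts stripped.length            # 2 characters: U+00A0 and "y"

# In UTF-8 the character is encoded as the two bytes 0xC2 0xA0 (octal \302\240).
puts text.bytes.first(2).inspect

# Replacing it with an ordinary space first lets strip do its job.
cleaned = text.gsub("\u00A0", ' ').strip
puts cleaned == 'y'
```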
I built the same substitution right into String#strip and String#strip! because, in the context of my application, non-breaking spaces are whitespace.
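class String
  # U+00A0 is the non-breaking space (\302\240 in UTF-8) that &nbsp; turns into.
  # \A and \z anchor to the whole string, not to each line. Needs Ruby 1.9+ for \u.
  NBSP_EDGES = /\A[\u00A0\s]+|[\u00A0\s]+\z/

  # Keep the built-in behavior reachable.
  alias_method :old_strip, :strip

  # Like the built-in strip, but also removes leading and trailing U+00A0.
  def strip
    gsub(NBSP_EDGES, '')
  end

  # In-place version; returns nil when nothing was removed, like the built-in strip!.
  def strip!
    gsub!(NBSP_EDGES, '')
  end
end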
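Reopening String changes strip for every library loaded into the process. Where that is too broad, the same idea can live in a standalone helper instead. This is only a sketch of that alternative, not the approach taken above: the module name NbspStrip and its method names are hypothetical, and it assumes Ruby 1.9 or later with UTF-8 strings.

```ruby
# Hypothetical helper: NBSP-aware stripping without reopening String.
module NbspStrip
  # Runs of ordinary whitespace or U+00A0 at the very start or end of the string.
  EDGES = /\A[\u00A0\s]+|[\u00A0\s]+\z/

  # Returns a new string with both edges trimmed.
  def self.strip(str)
    str.gsub(EDGES, '')
  end

  # Trims in place; returns nil when nothing was removed, mirroring String#strip!.
  def self.strip!(str)
    str.gsub!(EDGES, '')
  end
end

padded = "\u00A0 y\u00A0z \u00A0"
puts NbspStrip.strip(padded).inspect       # only the edges are trimmed
puts NbspStrip.strip!("y".dup).inspect     # nothing to trim
```

Callers then opt in explicitly with NbspStrip.strip(text), and the built-in String#strip keeps its usual meaning everywhere else.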