class

HTML::Tokenizer

v2.2.1 - Superclass: Object

A simple HTML tokenizer. It breaks a stream of text into tokens, where each token is a string representing either a run of text or an HTML element.

The tokenizer currently assumes valid XHTML, which means the input must contain no free < or > characters.

Usage:

  tokenizer = HTML::Tokenizer.new(text)
  while token = tokenizer.next
    p token
  end
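
To make the token stream concrete without loading actionpack, here is a minimal sketch of the same idea built on Ruby's standard StringScanner. It is not the HTML::Tokenizer source: the names SketchTokenizer and sketch_tokens are hypothetical, and the matching rules (a tag runs from "<" to the next ">", text runs up to the next "<") are assumptions chosen only for illustration.

```ruby
require "strscan"

# Hypothetical stand-in for illustration only; NOT the actionpack implementation.
class SketchTokenizer
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  # Returns the next token as a string, or nil once the input is exhausted.
  def next
    return nil if @scanner.eos?
    # Assumed rules: a tag is "<" up to the next ">"; text is everything up
    # to the next "<". A lone "<" with no closing ">" is returned by itself.
    @scanner.scan(/<[^>]*>/) || @scanner.scan(/[^<]+/) || @scanner.getch
  end
end

# Hypothetical helper: collects every token into an array.
def sketch_tokens(text)
  tokenizer = SketchTokenizer.new(text)
  tokens = []
  while (token = tokenizer.next)
    tokens << token
  end
  tokens
end

p sketch_tokens("<p>Hello <b>world</b></p>")
# prints ["<p>", "Hello ", "<b>", "world", "</b>", "</p>"]
```

The loop in sketch_tokens has the same shape as the usage example above: call next until it returns nil, treating each returned string as one token.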

Attributes

[R]line
[R]position

Files

  • actionpack/lib/action_controller/vendor/html-scanner/html/tokenizer.rb