Package gozerbot :: Package contrib :: Module feedparser :: Class _BaseHTMLProcessor

Class _BaseHTMLProcessor

markupbase.ParserBase --+    
                        |    
       sgmllib.SGMLParser --+
                            |
                           _BaseHTMLProcessor
Known Subclasses:

Instance Methods

__init__(self, encoding)
Initialize and reset this instance.

reset(self)
Reset this instance.

_shorttag_replace(self, match)

feed(self, data)
Feed some data to the parser.

normalize_attrs(self, attrs)

unknown_starttag(self, tag, attrs)

unknown_endtag(self, tag)

handle_charref(self, ref)
Handle character reference, no need to override.

handle_entityref(self, ref)
Handle entity references.

handle_data(self, text)

handle_comment(self, text)

handle_pi(self, text)

handle_decl(self, text)

_new_declname_match(...)
match(string[, pos[, endpos]]) --> match object or None.

_scan_name(self, i, declstartpos)

output(self)
Return processed HTML as a single string.

Inherited from sgmllib.SGMLParser: close, error, finish_endtag, finish_shorttag, finish_starttag, get_starttag_text, goahead, handle_endtag, handle_starttag, parse_endtag, parse_pi, parse_starttag, report_unbalanced, setliteral, setnomoretags, unknown_charref, unknown_entityref

Inherited from markupbase.ParserBase: getpos, parse_comment, parse_declaration, parse_marked_section, unknown_decl, updatepos

Inherited from markupbase.ParserBase (private): _parse_doctype_attlist, _parse_doctype_element, _parse_doctype_entity, _parse_doctype_notation, _parse_doctype_subset

Class Variables
  elements_no_end_tag = ['area', 'base', 'basefont', 'br', 'col'...

Inherited from sgmllib.SGMLParser: entitydefs

Inherited from sgmllib.SGMLParser (private): _decl_otherchars

Method Details

__init__(self, encoding)
(Constructor)

Initialize and reset this instance.

Overrides: markupbase.ParserBase.__init__

reset(self)

Reset this instance. Loses all unprocessed data.

Overrides: markupbase.ParserBase.reset

feed(self, data)


Feed some data to the parser.

Call this as often as you want, with as little or as much text as you want (may include '\n'). (This just saves the text; all the processing is done by goahead().)

Overrides: sgmllib.SGMLParser.feed
(inherited documentation)
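
The feed-then-output pattern described above can be sketched roughly as follows. Because sgmllib exists only in Python 2, this sketch uses Python 3's html.parser.HTMLParser as a stand-in. The class name PassthroughProcessor and its pieces attribute are assumptions made for illustration; this is not the module's actual implementation.

```python
from html.parser import HTMLParser


class PassthroughProcessor(HTMLParser):
    """Hypothetical stand-in: collects markup pieces and re-joins them."""

    def __init__(self):
        # convert_charrefs=False keeps entity and character references
        # as separate events instead of folding them into the text.
        super().__init__(convert_charrefs=False)

    def reset(self):
        # Drop collected output along with the parser's unprocessed data.
        self.pieces = []
        super().reset()

    def handle_starttag(self, tag, attrs):
        self.pieces.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.pieces.append("</%s>" % tag)

    def handle_entityref(self, name):
        # Re-emit the reference unchanged (an assumption of this sketch).
        self.pieces.append("&%s;" % name)

    def handle_charref(self, name):
        self.pieces.append("&#%s;" % name)

    def handle_data(self, data):
        self.pieces.append(data)

    def output(self):
        """Return everything collected so far as a single string."""
        return "".join(self.pieces)


# feed() may be called any number of times with chunks of any size.
parser = PassthroughProcessor()
for chunk in ("<p>Hello, ", "wor", "ld</p>"):
    parser.feed(chunk)
parser.close()
result = parser.output()
```

Calling reset() on the stand-in discards both the buffered input and the collected pieces, which mirrors the "loses all unprocessed data" note on reset above.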

unknown_starttag(self, tag, attrs)

Overrides: sgmllib.SGMLParser.unknown_starttag

unknown_endtag(self, tag)

Overrides: sgmllib.SGMLParser.unknown_endtag

handle_charref(self, ref)

Handle character reference, no need to override.

Overrides: sgmllib.SGMLParser.handle_charref
(inherited documentation)

handle_entityref(self, ref)

Handle entity references.

There should be no need to override this method; it can be tailored by setting up the self.entitydefs mapping appropriately.

Overrides: sgmllib.SGMLParser.handle_entityref
(inherited documentation)
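
The inherited note about tailoring behaviour through the entitydefs mapping can be illustrated with a small sketch. Note that _BaseHTMLProcessor overrides this method and this page does not show its body, so the code below only illustrates the inherited documentation. The classes EntityResolver and CopyrightAware, the pieces list, and the pass-through handling of unknown names are all hypothetical.

```python
class EntityResolver:
    """Hypothetical stand-in showing lookup through an entitydefs mapping."""

    # Mapping of entity names to replacement text.
    entitydefs = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}

    def __init__(self):
        self.pieces = []

    def handle_data(self, text):
        self.pieces.append(text)

    def unknown_entityref(self, ref):
        # Assumption of this sketch: unknown references pass through unchanged.
        self.pieces.append("&%s;" % ref)

    def handle_entityref(self, ref):
        # Known names resolve through the mapping; others fall through.
        if ref in self.entitydefs:
            self.handle_data(self.entitydefs[ref])
        else:
            self.unknown_entityref(ref)


class CopyrightAware(EntityResolver):
    # Tailor behaviour by extending the mapping, not by overriding the handler.
    entitydefs = dict(EntityResolver.entitydefs, copy="\u00a9")


resolver = CopyrightAware()
resolver.handle_entityref("copy")   # resolved through the extended mapping
resolver.handle_entityref("nbsp")   # not in the mapping, passed through
```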

handle_data(self, text)

Overrides: sgmllib.SGMLParser.handle_data

handle_comment(self, text)

Overrides: sgmllib.SGMLParser.handle_comment

handle_pi(self, text)

Overrides: sgmllib.SGMLParser.handle_pi

handle_decl(self, text)

Overrides: sgmllib.SGMLParser.handle_decl

_new_declname_match(...)

match(string[, pos[, endpos]]) --> match object or None. Matches zero or more characters at the beginning of the string.
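
The signature above is that of the match method of a compiled regular expression, bound to a class attribute. A minimal sketch of how such a bound method behaves with the optional pos argument is shown here; the pattern is a hypothetical choice for illustration, since this page does not show the one the class actually uses.

```python
import re

# Hypothetical pattern: a name starting with a letter, then trailing whitespace.
declname_match = re.compile(r"[a-zA-Z][-_.a-zA-Z0-9:]*\s*").match

text = "<!DOCTYPE html>"

# Start matching at index 2, just past the "<!" prefix.
m = declname_match(text, 2)
name = m.group().strip()
end = m.end()

# The match is anchored at pos, so a non-letter there yields None.
no_match = declname_match(text, 0)
```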

_scan_name(self, i, declstartpos)

Overrides: markupbase.ParserBase._scan_name

Class Variable Details

elements_no_end_tag

Value:
['area',
 'base',
 'basefont',
 'br',
 'col',
 'frame',
 'hr',
 'img',
...
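
One plausible use for a list like this is deciding how to serialize a start tag when markup is written back out. The sketch below is an assumption about that use, not code taken from the class: render_starttag is a hypothetical helper, and the list contains only the eight names visible in the truncated value above.

```python
# Only the names visible in the truncated value above; the real list is longer.
ELEMENTS_NO_END_TAG = ["area", "base", "basefont", "br", "col",
                       "frame", "hr", "img"]


def render_starttag(tag, attrs):
    """Hypothetical helper: serialize a start tag from (name, value) pairs."""
    strattrs = "".join(' %s="%s"' % (key, value) for key, value in attrs)
    if tag in ELEMENTS_NO_END_TAG:
        # Elements without an end tag are written in self-closing form.
        return "<%s%s />" % (tag, strattrs)
    return "<%s%s>" % (tag, strattrs)


void_tag = render_starttag("br", [])
normal_tag = render_starttag("a", [("href", "x")])
```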