SGML and other issues in parser for web browser

Discussion in 'Firefox' started by Surendra Singhi, Feb 22, 2005.

  1. Hello all,
    I am trying to write a parser for a web browser called "w3". It is
    distributed with emacsen, but development work on it is now moribund.
    While looking at the parser's code (it was last modified in 1996), I saw
    a lot of hacks for dealing with SGML.
    I felt like removing them, but before doing that I thought it would be
    wise to draw upon the wisdom of other people who are involved in the
    development of browsers.
    Are there still a lot of legacy web pages out there that use SGML?
    Should I retain support for SGML or remove it? Removing it would make
    the parser faster and keep the code simple, and speed is a big concern
    for this browser.
    At one point I was also contemplating making a parser just good enough
    to consume XHTML 2.0, but then I thought that XHTML 2.0 is hardly used
    by anyone, so I should support HTML 4.01 instead. Any opinions on this
    are also welcome.

    Suggestions on the design of the parser are also welcome.

    Surendra Singhi

    "O thou my friend! The prosperity of Crime is like unto the lightning,
    whose traitorous brilliancies embellish the atmosphere but for an
    instant, in order to hurl into death's very depths the luckless one they
    have dazzled." -- Marquis de Sade
