It's funny.  Laugh.

Obfuscated HTML Contest? 81

ptaff asks: "We all know the nightmare of the typical HTML developer: you get different results on different browsers/platforms (and we're talking HTML only, no CSS/scripts). To make matters worse, MSIE has this ability to render completely invalid HTML code (missing tags, invalid nesting, you get the point). Mozilla and its many cousins are trying hard to keep up with the inconsistencies of today's 'web-optimized-for-MSIE', but where is the limit? As an exercise, can you build the most malformed HTML document that MSIE can render but that other browsers will choke on?"
This discussion has been archived. No new comments can be posted.

  • Theory & practise (Score:4, Insightful)

    by RyoSaeba ( 627522 ) on Thursday December 12, 2002 @07:11AM (#4869277) Journal
    Well, I guess that's the difference between the theoretical stuff (the HTML standard) and the practical implementation (browsers)...
    Maybe the time required for a feature to become standard HTML also plays a role: do you think people are going to wait months for a feature when the browser (broken, or anticipating the new standard) can already do it?

    Isn't that, after all, how the Internet itself usually works? I.e., people do something in different ways, usually without any standard, or by extending one, and then some mix of everything becomes 'the' new standard (RFCs and so on)?
    • RFC=Request for Comments

      Meaning, they care what people have to say. They want to get community input to make this a community standard, not just something they make up.

      Has MS ever made an RFC?
      • AFAIK, no, but the mere fact that their ideas become part of HTML proves that the community liked them, no ?
        • the mere fact that their ideas become part of HTML proves that the community liked them, no ?

          No.

          It could merely be that the only reference "the community" used was written by the same company that made the browser that "the community" used to "test" the page with, and the same company also wrote the software they used to "write" the page with.

          IOW... some company took advantage of the clueless drones that they created. That is, after all, their biggest asset.

          After that, other browsers have to implement the same "errors" because it is easier than educating managers that *their* own sites are wrong and the one *you* made isn't... Even tho *their* site works in *their* browser, and *your* site does not.
  • by Myself ( 57572 ) on Thursday December 12, 2002 @07:28AM (#4869355) Journal
    Why not make it render something different but valid in as many different browsers as possible?

    My sympathy goes out to the judges of this contest.
  • by 3-State Bit ( 225583 ) on Thursday December 12, 2002 @07:44AM (#4869422)
    View source. Go ahead. Right now.
    I dare you to glance through it.
    You'll not sleep tonight.
  • are you looking for the greatest variety of broken tags, or the greatest number of broken tags, or what? i'm not sure how you could define one page as more malformed than another. i could make a page with 1,000,000 broken td tags. would that win?
  • This may seem pointless to many people here, but this actually serves a purpose: the creators of the browsers can use this code to analyze the shortcomings of their browser.

    I _know_ mozilla is more standards compliant than ie, but this is not about standards. It's about acceptance by the masses. The more sites that are rendered right, the better the chances are.
    • by Captain Large Face ( 559804 ) on Thursday December 12, 2002 @09:18AM (#4869936) Homepage
      I strongly disagree. HTML standards are standards for a very good reason -- it allows ALL producers of HTML clients AND HTML editors to aim for a common goal.

      Following the "standards" as laid down by Internet Explorer will mean those writing HTML documents will continue in bad habits learnt during the so-called browser wars between Microsoft and Netscape. If you take 10 random sites and check the source code of the home page, I'd wager that none of them are using valid HTML 4, although the standard has been public for over four years!

      • Well, given the fact that one can get by quite nicely without following the standard, very few people will care about this. It would be nice, IMO, if browsers were less tolerant of sloppy standards violations. If my HTML says it's HTML 4.0 compliant, and I do something not in the standard, then the browser should just throw it back up with an error message.

        That would solve a lot of problems. It would create some problems for people who shouldn't be coding in the first place, but I won't lose sleep over that one.
        • On the other hand, we have a standard for english grammer and not too many around here follow it. (Well, not me at least. :)

          But the message is usually rendered correctly. :-)
          • >On the other hand, we have a standard for english
            >grammer and not too many around here follow it.
            >(Well, not me at least. :)

            s/grammer/grammar/

            >But the message is usually rendered correctly. :-)

            No offence, but bad spelling and grammar make you look dumber. This can distract the reader from your message, and affect your credibility.

            Gratuitously bad HTML makes a person or company look dumber. This can distract the reader from their message, and affect their credibility.
            • Still got my point tho', didn't ya? ;)
              • >Still got my point tho', didn't ya? ;)

                Yup, I shore did.

                But if you are trying to sell me something, and you can't spell, I wonder how good your product is.

                If you are trying to convince me of something, and you have poor grammar, I wonder if your ideas are well thought out.

                If you are trying to present tech info (a HOWTO etc.) with poor spelling/grammar, I wonder if your facts are sound.

                It all comes back to credibility, I reckon. Ain't life a bitch?
  • by redcliffe ( 466773 ) on Thursday December 12, 2002 @08:16AM (#4869555) Homepage Journal
    That's totally obfuscated......
  • I once (a few years ago) inherited a web project that was managed by Net Objects Fusion. That was bad enough, except that the hosting server only allowed uploading via Frontpage Extensions (go figure). So once any updates were done, the pages had to be exported out of Net Objects, then brought into the Frontpage project so they could be uploaded.

    If you can imagine the HTML that came out of that little combo. Not pretty.

    I also saw one site that looked to be a combo of MS Word and Net Objects. I still have nightmares about that one.........

  • by Bazzargh ( 39195 ) on Thursday December 12, 2002 @08:39AM (#4869685)
    I've always thought this was fantastically obscure, and it uses a hellish mix of applet, object, and embed tags to make things work. Remember, what appears below is recommended practice!

    Old Style:

    <APPLET code=XYZApp.class codebase=html/ align=baseline width=200 height=200> <PARAM NAME=model VALUE=models/HyaluronicAcid.xyz> No Java 2 SDK, Standard Edition v 1.3 support for APPLET!! </APPLET>

    New Style:

    <EMBED type=application/x-java-applet;version=1.3 width=200 height=200 align=baseline code=XYZApp.class codebase=html/ model=models/HyaluronicAcid.xyz pluginspage=http://java.sun.com/products/plugin/1.3/plugin-install.html> <NOEMBED><XMP> <APPLET code=XYZApp.class codebase=html/ align=baseline width=200 height=200></XMP> <PARAM NAME=java_code VALUE=XYZApp.class> <PARAM NAME=java_codebase VALUE=html/> <PARAM NAME=java_type VALUE=application/x-java-applet;version=1.3> <PARAM NAME=model VALUE=models/HyaluronicAcid.xyz> <PARAM NAME=scriptable VALUE=true> No Java 2 SDK, Standard Edition v 1.3 support for APPLET!! </APPLET></NOEMBED></EMBED> </OBJECT>
    • You know that embed is not in the W3C HTML specs, and that applet is deprecated in HTML 4.01 and absent from XHTML 1.0 Strict? It is deprecated or not allowed, but still needed by many browsers. The reason you have to write that embed/object/applet combination comes from the browser wars.

      But you are right, that's really bad code, and it is needed for cross-browser compatibility. Sad part is that it won't validate against any common HTML standard - but I don't know who to blame for this: W3C had to set one tag as the standard for embedding objects (media files, Flash, Java applets, you name it), and browsers that exist and won't be changed for some time cannot be blamed. I wonder when Mozilla will display a Flash movie without using the embed tag. So this obfuscated (or redundant) code will be in pages for a long time to come...

      • Yes, I know this. There's an article on devedge [netscape.com] which explains the issues pretty well. The 'obtainment' issue the devedge writer describes is interesting: I have to assume that W3C felt that pages should link to the download page for the plugin if the plugin was unavailable, while MS felt plugins should download and auto-install.

        Sun's description [sun.com] shows you the nitty gritty, but explains less of the 'why'.

        The worst of this is the jsp:plugin tag, which generates code for the 1.3 plugin. Saves a lot of typing, but won't it get out of date pretty fast?

        -Baz
      • Mozilla already will display flash without using the embed tag. The object tag works fine for flash with moz, 1.1+ anyway, dunno about 1.0.

  • by jolshefsky ( 560014 ) on Thursday December 12, 2002 @09:09AM (#4869884) Homepage
    Step 1: type "Hello world." in a Microsoft Word document.

    Step 2: save as a web page.

    The result:

    <html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word"
    xmlns="http://www.w3.org/TR/REC-html40">

    <head>
    <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
    <meta name=ProgId content=Word.Document>
    <meta name=Generator content="Microsoft Word 9">
    <meta name=Originator content="Microsoft Word 9">
    <link rel=File-List href="./Hello%20world_files/filelist.xml">
    <title>Hello world</title>
    <!--[if gte mso 9]><xml>
    <o:DocumentProperties>
    <o:Author>Administrator</o:Author>
    <o:LastAuthor>Administrator</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>0</o:TotalTime>
    <o:Created>2002-12-12T13:01:00Z</o:Created>
    <o:LastSaved>2002-12-12T13:01:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Company>Yoyodyne Propulsion Systems, Inc.</o:Company>
    <o:Lines>1</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:Version>9.4402</o:Version>
    </o:DocumentProperties>
    </xml><![endif]-->
    <style>
    <!--
    /* Style Definitions */
    p.MsoNormal, li.MsoNormal, div.MsoNormal
    {mso-style-parent:"";
    margin:0in;
    margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:12.0pt;
    font-family:"Times New Roman";
    mso-fareast-font-family:"Times New Roman";}
    @page Section1
    {size:8.5in 11.0in;
    margin:1.0in 1.25in 1.0in 1.25in;
    mso-header-margin:.5in;
    mso-footer-margin:.5in;
    mso-paper-source:0;}
    div.Section1
    {page:Section1;}
    -->
    </style>
    </head>

    <body lang=EN-US style='tab-interval:.5in'>

    <div class=Section1>

    <p class=MsoNormal>Hello world.</p>

    </div>

    </body>

    </html>
    Of course, it breaks the rules because it uses style sheets, but who's counting...
    • Break what rules? Style Sheets is the standard for HTML formatting according to W3C [w3.org].
      • Wow, here's the quote from the top of THIS page that you somehow glaringly missed...


        (and we're talking HTML only, no CSS/scripts)


        Sorry, but this article obviously wasn't posted to incite yet another standards debate.
        • that you somehow glaringly missed...
          I didn't miss anything. There's nothing obfuscated about that code he posted as it is W3C HTML 4.01 Strict compatible - assuming you remove that XML crap. So, if (following the rules of the original post) you remove style sheets and the XML, you have HTML standards compliancy.
          • Yeah, but obfuscated doesn't mean non-standards-compliant. It means hard to read.
          • >I didn't miss anything. There's nothing obfuscated about that code he posted as it is W3C HTML 4.01 Strict compatible

            The posted code (MS Word HTML export) is definitely not W3C HTML 4.01 Strict. (Transitional? Most probably).

            Furthermore, the XML code segment is not valid inside an HTML 4.01 document. So it should have been XHTML, and by no means is it XHTML compliant (no quotes around attribute values, empty elements are not closed, etc.)

            However, at least some of the following aspects are much better compared to prior versions and in some respects even Mozilla Composer.

            • Tags are in lowercase.
            • All tags have closing tags.
            • Use of CSS.

            Thanks for your patience.
            2002-12-18 17:46:26 UTC (2002-12-18 12:46:26 EST)

  • by Clover_Kicker ( 20761 ) <clover_kicker@yahoo.com> on Thursday December 12, 2002 @09:12AM (#4869905)
    Here [infotrope.net] is a classic.

    It just looks dumb in Mozilla, but you can use IE to truly experience the horror.

    I believe this was originally designed as an object lesson that HTML email and usenet posts are a bad idea.

    There is no author identified, but I'd love to know who came up with this one.
  • by HTD ( 568757 ) on Thursday December 12, 2002 @09:13AM (#4869913) Homepage

    I think it's better to find pages that use such code. Example: the www.europcar.com .de .fr pages; they use a JavaScript menu that works ONLY in MSIE on Windows. No MacIE, Mozilla (choose your platform), Opera 7 or other alternative browsers. You simply cannot see the menu or cannot use it, and therefore you cannot navigate. There are more pages out there; writing this code on purpose is pointless, because it has already been written ;) Find those pages and complain, and make a publicly available list of invalid, non-working HTML pages. Write to the webmasters about your problems. And of course show workarounds so that those "programmers" can see and fix their mistakes.

    A good reason for coding obfuscated (be it valid or invalid) HTML would be to create a repository of "real world" code for Browser developers out there to check if it works with their product. Then of course a "desired output" image should be attached to the code.

    Creating a blacklist of corporate pages using invalid HTML is my favourite idea, but the repository mentioned above would help a lot of coders out there...
  • Check some of their entries for obfuscated HTML and JavaScript.

    They even have a 5k version of Wolfenstein.

    And most of the entries work in IE only. :(

    Joe
    http://josephgrossberg.blogspot.com [blogspot.com]

    • I checked out the 5k site. Most of the entries were Flash, not that impressive. But the 5k Wolfenstein [the5k.org] (will only work in IE on Win, yadda yadda) game you mentioned... (h,j,k,n to move, space to shoot).

      Yikes! It's a small FPS done entirely in JavaScript(!), including multiple independently moving foes, and the ability to shoot them. And in less than 5 kB! I spent an hour or so reverse-engineering the program.

      As far as I can tell, it works by generating the (1 bitplane BW) graphics into an array p, then creating a JavaScript source code string that contains a definition of the image (im="... static char t_bits[]={(things based on p)}"), then inserting that back into the page with document.images[0].src="javascript:count;im;", where im is the name of the variable containing the above-mentioned string...
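      The `static char t_bits[]` fragment quoted above looks like XBM (X BitMap) source text, which is just a C-style array describing a 1-bit image. As a rough illustration of that step only, here is a minimal Python sketch (not the entry's actual JavaScript) that turns a 2-D list of 0/1 pixels into XBM text. The function name `bits_to_xbm` and the sample pixels are made up for this example.

```python
def bits_to_xbm(rows, name="t"):
    """Render a 2-D list of 0/1 pixels as XBM (X BitMap) source text."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    data = []
    for row in rows:
        # XBM pads every row to a whole number of bytes.
        for start in range(0, width, 8):
            byte = 0
            for bit, pixel in enumerate(row[start:start + 8]):
                if pixel:
                    # The leftmost pixel of each group is the least significant bit.
                    byte |= 1 << bit
            data.append(f"0x{byte:02x}")
    body = ", ".join(data)
    return (
        f"#define {name}_width {width}\n"
        f"#define {name}_height {height}\n"
        f"static char {name}_bits[] = {{{body}}};\n"
    )


# Hypothetical 9x2 image: two pixels set in the first row, one in the second.
sample = [
    [1, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
]
print(bits_to_xbm(sample))
```

      A browser that understands XBM (IE on Windows did at the time) can then be handed that string as image data, which is presumably why the trick is IE-only.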

      Do check out the source code! This is heavy !!

      PS. I played it a little bit more. Oh no, the thing even has scoring, multiple levels with increasing numbers of foes... (/me looks surprisedly at his once rampant, now wilting ego.)

      PPS. Oh, and Window Pong [the5k.org] (keypad numlocked 8+2) was good for a laugh, and seems more compatible.

  • Doing my part (Score:2, Interesting)

    by mnmn ( 145599 )

    I'm setting up a value web-hosting system in the next 6 months using Fractional T1.. and one of the plans is to run all submitted HTML code through the validator script, and add a warning message at the bottom of the page if it has errors. This will be mentioned in the SLA.

    Just doing my part to put the standards back into the web.
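    A plan like this could be prototyped along the following lines. This is only a sketch under loud assumptions: the nesting check below is a crude stand-in for a real DTD-based validator (it even flags omitted end tags such as `</p>` that HTML 4 allows), and the names `NestingChecker` and `add_warning_if_invalid` are invented for the example.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML 4.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class NestingChecker(HTMLParser):
    """Very rough check: every non-void start tag needs a matching end tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br/> open and close nothing.
        pass

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def add_warning_if_invalid(page):
    """Return the page unchanged if it looks fine, else with a warning added."""
    checker = NestingChecker()
    checker.feed(page)
    checker.close()
    problems = checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]
    if not problems:
        return page
    warning = f"<p>Warning: this page has {len(problems)} markup problem(s).</p>"
    # Put the warning at the bottom of the body if there is one, else at the end.
    index = page.lower().rfind("</body>")
    if index == -1:
        return page + warning
    return page[:index] + warning + page[index:]


print(add_warning_if_invalid("<html><body><b><i>Hi</b></i></body></html>"))
```

    Note that the page is still served; it only gains a warning, which matches the plan as described.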
    • Not a smart idea. Sure, you can run it through an HTML validator, but give them a warning message, don't prohibit them from using it. That's just plain mean. What if I don't like using lowercase tags? Uppercase is so much easier to use. And what if I don't like <strong> or <em>? It's a nice concept, but don't make it mandatory.
  • to write Strict XHTML DTD based Obfuscated XHTML that chokes one XHTML browser and works on another XHTML browser. And of course both browsers should fully support the Strict XHTML DTD. http://docbook.sc-icc.org
  • Dreamweaver routinely writes code that does not work on many versions of netscape/mozilla/phoenix (especially for linux).. Their built-in javascript stuff is the biggest culprit..
  • Slashdot wins the contest! All praise! Yay!

    (waves around Slashdot-logo emblazoned flag)
  • New rules:
    1. The source must validate. This rule applies to both (X)HTML [w3.org] and CSS [w3.org]
    2. Allowed doctypes are HTML 4.01 Strict, XHTML 1.0 Strict and XHTML 1.1. For styling, anything up to CSS2 is allowed. Conditional comments [microsoft.com] are disallowed because they would make the contest too easy.
    3. Page must be readable with Mozilla 1.2 and Opera 7.0 (beta)
    4. The winner is the one with the most artistic rendering in MSIE6/win32, combined with unreadable source.
    5. Extra points, if page is still readable in the Netscape Navigator 4.x.
    6. No scripting is allowed.

    I think you could get pretty interesting results by layering elements one over another and creating images from the interference patterns caused by letters laid over other letters. Use CSS features that MSIE doesn't implement, or has bugs in, to correct the positioning in correctly behaving browsers, and the @import trick to keep NN4.x in the game.

    Creating a page that works only in one nonstandard browser is too easy. Creating a standards-compliant page that works in every browser but one buggy one should be hard enough.

    • (from the conditional comments [microsoft.com] page)

      uplevel browser
      Internet Explorer 5 and later versions.

      downlevel browser
      Any browser except for Internet Explorer 5 and later versions.

      The arrogance of "downlevel" infuriates me. They couldn't just say "other" or "non-Microsoft". The implicit assumption is that if it's not using IE, it's crap.
    • Just a straight hard to read code:

      I got a piece of ASCII art from fortune on my RH 7.2 machine, an atomic bomb blast (just happened to be the one the cgi-bin picked to spit out), with every space replaced by a &nbsp;. (Oh, and of course the & replaced by &amp;, the > replaced by &gt;, and ^I characters replaced by 8 spaces, which happens right before the &nbsp; part, etc.) The only styling was a <P STYLE="font-family: monospace"> so it would line up properly. Displays just fine in Mozilla 1.2, Lynx, Netscape 7, and Opera 5 on my box. Haven't tested in any others, but it passes the W3C HTML 4.01 Strict validator, although the source hurts to read (and if I knew a bit more sed/awk/tr etc., I could make the HTML source even harder to grok).

      If you really wanted your HTML to be hard to read you could always give the ASCII number (or Unicode for fun) of every character on the page, so reading the source you would just see the tags in the clear. AB is written in HTML as &#65;&#66; and it is perfectly valid HTML 4.01 Strict from my tests.

      Just my 2 bits.
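      The "number for every character" idea is easy to try out. A minimal Python sketch follows; the helper name `obfuscate_text` is made up here, and it is meant for text content only, since encoding the tags themselves would stop them from being tags.

```python
import html


def obfuscate_text(text):
    """Replace every character with its decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


encoded = obfuscate_text("AB")
print(encoded)                 # &#65;&#66;
print(html.unescape(encoded))  # AB
```

      Decoding with `html.unescape` gives the original text back, which is also why this kind of obfuscation is trivial for a tool to undo.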
      • If you really wanted your HTML to be hard to read you could always give the ascii number (or unicode for fun) of every character on the page

        I wouldn't rate such a hack as a good contestant simply because the method is way too simple. But if you prefer to do such a thing, just use this perl script [uni-sb.de].

        Note that HTML Tidy can easily clean up such simple hacks. Truly unreadable source cannot be fixed with something as simple as HTML Tidy. You can try the above perl script on some HTML file and then feed that file to HTML Tidy Online [infohound.net].

        And just for the record, numeric character entities always refer to unicode character code positions [cs.tut.fi]. For example, &#151; (0x97) is undefined (reserved), even though many people try to use that in HTML source to represent emdash [www-f9.ijs.si].
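        The 0x97 confusion comes from Windows-1252, where byte 0x97 is the em dash, whereas Unicode code point 151 is a control character. A small Python sketch of the conversion, assuming the source bytes really are Windows-1252 (the helper name `windows1252_byte_to_charref` is invented for this example):

```python
import unicodedata


def windows1252_byte_to_charref(byte_value):
    """Map one Windows-1252 byte to a numeric reference for its Unicode code point."""
    # Raises UnicodeDecodeError for the few bytes Windows-1252 leaves undefined.
    character = bytes([byte_value]).decode("cp1252")
    return f"&#{ord(character)};"


print(windows1252_byte_to_charref(0x97))  # &#8212;
print(unicodedata.category(chr(151)))     # Cc (a control character, not a dash)
```

        So the portable way to write an em dash as a numeric reference is &#8212;, not &#151;.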

  • The paradigm of interoperability has always been:
    Be conservative in what you generate, liberal in what you accept.
    In other words, only generate documents that are standards-compliant. But in accepting documents, you shouldn't be penalized for liberally accepting things that are not kosher by the standards.
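    As a toy illustration of that principle applied to markup: accept sloppy input (uppercase tags, unquoted or single-quoted attribute values) but emit a conservative form (lowercase tags, double-quoted values). This Python sketch uses the standard library's tolerant `html.parser`; the `Normalizer` class and `normalize` function are names invented for the example, and it drops comments and doctypes for brevity.

```python
from html import escape
from html.parser import HTMLParser


class Normalizer(HTMLParser):
    """Liberal on input, conservative on output."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]  # the parser has already lowercased the tag name
        for name, value in attrs:
            if value is None:
                parts.append(name)  # minimized attribute such as "checked"
            else:
                quoted = escape(value, quote=True)
                parts.append(f'{name}="{quoted}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded on input, so re-escape on output.
        self.out.append(escape(data, quote=False))


def normalize(markup):
    parser = Normalizer()
    parser.feed(markup)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.out)


print(normalize("<P CLASS=MsoNormal>Hello world.</P>"))
# <p class="MsoNormal">Hello world.</p>
```

    The parser never rejects the input; only the generated output is held to the stricter form.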

    I don't like Internet Exploder. I don't really like Netscrape, either. But I won't fault either for rendering a page that's not completely standards compliant; I'd guess that 95% of the pages out there wouldn't render if the browsers were as strict as, for example, the HTML validator [w3.org].

  • Doesn't Obfuscated Incompatible Code happen every time you open FrontPage?
  • This is backwards... (Score:3, Interesting)

    by Da VinMan ( 7669 ) on Thursday December 12, 2002 @03:07PM (#4873265)
    Obfuscated HTML?! Anyone can do that! Sorry, but most HTML out there is fairly crappy.

    Wouldn't an un-obfuscated HTML contest where the code is judged by how well it plays and demonstrates advanced features on multiple browsers be more challenging?

    Some reusable bits may actually come about as the result of this sort of contest.
  • I'm not sure why you would purposely come up with bad code ... but if you want to see some good examples of bad html just surf to any goeshities [geocities.com] page and enjoy the horror that it is.

  • You could have a whole "weight class" for obfuscated CSS, since browser support for that standard is so uneven. (As usual, it's getting better, but I still deal regularly with users still "standardized" on Netscape 4.7x by their support people.)
