HTML5: What Is It?
By the simplest definition, an HTML5 document is an HTML document that uses only nodes (tags) defined in the HTML5 specification, in a manner consistent with that specification.
The HTML5 specification can be found at http://www.w3.org/TR/html5/.
The Basics of an HTML5 Document
Look at the following code:
<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>This is an HTML5 Web Page</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>
The above code validates as HTML5, making it officially an HTML5 document.
The XHTML version of HTML5 is very similar, except that it is sent as XML rather than HTML. By design, the only differences between the HTML and XHTML versions of the HTML5 specification are those required for the document to be accepted as well-formed XML by XML parsers.
In HTML (5 or otherwise), when a node (such as the meta node above) is never allowed to have children, it has no closing tag. Applications that parse HTML understand that there are no children and that the tag itself is all there is. XML, however, requires that every node be closed. Nodes that are never allowed to have child nodes (such as meta, img, br) must be self-closing. For example:
<meta charset="UTF-8" />
Adding the / at the end of the tag tells the XML parser that it is a self-closing node.
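As a rough illustration of that rule, the sketch below rewrites a few childless tags into self-closing form. The function name selfCloseVoidTags and the short element list are assumptions made for this example, not DOMBlogger code, and a regular expression is only adequate for simple markup (an attribute value containing > would confuse it); a real converter would use a proper parser.

```javascript
// Hypothetical helper: the name and the element list are illustrative only.
const VOID_ELEMENTS = ['meta', 'img', 'br', 'hr', 'input', 'link'];

function selfCloseVoidTags(html) {
  // Match an opening tag for one of the listed elements, with optional attributes.
  const pattern = new RegExp('<(' + VOID_ELEMENTS.join('|') + ')\\b([^>]*)>', 'gi');
  return html.replace(pattern, function (match, name, attrs) {
    // Strip any trailing slash and whitespace so an already closed tag is not doubled up.
    const cleaned = attrs.replace(/\s*\/?\s*$/, '');
    return '<' + name + cleaned + ' />';
  });
}

console.log(selfCloseVoidTags('<p>Line one<br>Line two</p>'));
```

Tags that are already self-closing pass through unchanged, so the function can be run over mixed input.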
XML also does not recognize most of the named entities that HTML does; it predefines only five (&amp;, &lt;, &gt;, &quot;, and &apos;). You can define named entities for XML in a document type declaration, but the HTML5 specification does not define a document type the parser can use to fetch the named entities. Thus, XHTML pages should not use HTML's other named entities. Instead, the literal character (encoded as UTF-8) or a numeric character reference should be used.
The DOMBlogger engine serves content as XHTML5 to clients that report an ability to handle the application/xhtml+xml MIME type, and as HTML5 to clients that do not. The DOMBlogger engine also takes care of self-closing tags, converting HTML entities to UTF-8, etc.
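To make the substitution concrete, here is a minimal sketch that swaps a handful of HTML named entities for numeric references. The function name toNumericEntities and the four-entry lookup table are assumptions for illustration only; HTML defines far more named entities than are listed here, and this is not how DOMBlogger itself performs the conversion.

```javascript
// A deliberately tiny, illustrative table (name -> Unicode code point).
const NAMED_TO_CODE_POINT = { nbsp: 160, copy: 169, eacute: 233, mdash: 8212 };
// The five entities every XML parser predefines are left untouched.
const XML_PREDEFINED = ['amp', 'lt', 'gt', 'quot', 'apos'];

function toNumericEntities(text) {
  return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, function (match, name) {
    if (XML_PREDEFINED.indexOf(name) !== -1) return match;
    if (Object.prototype.hasOwnProperty.call(NAMED_TO_CODE_POINT, name)) {
      return '&#' + NAMED_TO_CODE_POINT[name] + ';';
    }
    return match; // names missing from the table are left as they are
  });
}

console.log(toNumericEntities('caf&eacute;&nbsp;&amp;&nbsp;bar'));
```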
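The decision described above could be made by inspecting the Accept request header. The sketch below is one hedged way to do it, not the DOMBlogger implementation: the function name chooseContentType is hypothetical, and it only checks whether application/xhtml+xml is listed with a non-zero quality value.

```javascript
// Hypothetical helper: picks the response MIME type from an Accept header value.
function chooseContentType(acceptHeader) {
  const accept = (acceptHeader || '').toLowerCase();
  const acceptsXhtml = accept.split(',').some(function (part) {
    const pieces = part.split(';').map(function (p) { return p.trim(); });
    if (pieces[0] !== 'application/xhtml+xml') return false;
    // A quality value of 0 means the client explicitly refuses the type.
    const q = pieces.slice(1).find(function (p) { return p.startsWith('q='); });
    return q === undefined || parseFloat(q.slice(2)) > 0;
  });
  return acceptsXhtml ? 'application/xhtml+xml' : 'text/html';
}

console.log(chooseContentType('text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'));
```

Clients that send no Accept header, or one that only mentions text/html or a wildcard, fall through to plain HTML.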
But HTML5 can be so much more than just a simple web page as described above.
More Than Just A Web Page
HTML5 recognizes that web browsers are used for more than viewing text content with perhaps a few images; they are interactive, dynamic platforms from which full-blown applications are launched.
The world did not wait for HTML5 to start using web browsers as dynamic platforms. Very early on, browser plugin architecture allowed for interactive content inside a web browser, typically using Java or Flash.
The problem is that these plugins are not universally available across all platforms. They often pose security risks, and the quality of the plugin can vary according to version and platform. HTML5 attempts to solve this problem by providing a mechanism through which dynamic interactive content can be served to the user without the need for third-party plugins.
We may never live in a world without sin, but a world without browser plugins is attainable.
HTML5: Web Application Technologies
Artwork Credit
The HTML5 Logo and related technology logos used in this section of this article are taken from w3.org and are used under the terms of the Creative Commons Attribution 3.0 License.
HTML5 provides standards for several different technologies to assist in the creation of dynamic web applications in a world without browser plugins. The DOMBlogger platform does not take advantage of all of these technologies. Some are presently implemented, some are being worked on, and some probably will not be implemented.
Just because a technology exists does not mean it should be used. The point of delivering content on the web is usually to make the content available to as many people as possible. One must be careful that they do not exclude some people because new technologies were embraced before the masses had adopted a means to take advantage of said technology.
The following sections give brief explanations of the core standard components provided by HTML5, whether DOMBlogger uses them or not:
Semantics
HTML5 provides many new tags for page creation via semantic markup. These new tags make it easier for third party applications to understand the structure of the document.
For example, the new nav node is used to indicate that the links inside are navigational links within your web site. The new figure node indicates the content within is a figure related to main content but is not itself the main content. Some content conversion tools may want to float the figure to the end, as is often done in print publishing. HTML5 semantics give a way for these applications to better understand what is what within the document structure.
Semantic markup is a benefit to web scrapers, browsers for the visually impaired, etc.
The DOMBlogger engine takes full advantage of semantic markup in content it generates.
Offline & Storage
HTML5 provides technologies for assisting in the use of web applications when the client does not have a current connection to the Internet. DOMBlogger does not take advantage of this technology and probably will not.
This feature of HTML5 is better suited for applications such as word processors, spreadsheets, and games than it is for a content management system.
It is tempting to take advantage of this feature to allow drafts while using a WYSIWYG editor. However, there is better access to the draft if it is stored on the web server.
Device Access
The concept behind device access is that devices can tell the web application things about the device actually using the web application. If the device can sense temperature, for example, it could potentially notify the web application what the current temperature is where the device is being used. The tilt of the screen, the geographical location, all kinds of groovy information can be sent to the web application.
To be honest, it scares the hell out of me. I do not want such information being sent to a web application without my specific authorization, but HTML5 provides standard interfaces for allowing devices to send this information.
None of my devices have ever asked if they can send this kind of information to a web application; hopefully that means that they do not.
There are presently no plans to incorporate any of that technology into the DOMBlogger engine. It is possible the technology may be used in some applications that bolt on to DOMBlogger, but there really is no need for the engine itself to use it. Bells and whistles are not needed, especially bells and whistles that give away too much information. I never thought I would want SELinux for Android, but maybe it is necessary to make sure web clients cannot access features I do not want them to. Scary world.
My guess is the technology will largely be used by advertisers and stalkers, hopefully in that order.
Connectivity
Connectivity allows for better pushing of data between server and client through the use of WebSockets and Server-Sent Events.
This is particularly useful for things like chat engines, social networks, and games. The DOMBlogger engine does not currently have any code that falls into this category, but it is on my list of things to do, specifically a full-featured chat engine but possibly more.
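For a sense of how lightweight Server-Sent Events are, the sketch below builds one message in the text/event-stream format that a server writes to an open connection. The function name formatServerSentEvent and the chat event name are assumptions made for this example; nothing like this exists in DOMBlogger yet.

```javascript
// Hypothetical helper: serializes one message for a text/event-stream response.
function formatServerSentEvent(data, eventName) {
  const lines = [];
  if (eventName) lines.push('event: ' + eventName);
  // Each line of the payload needs its own "data:" field.
  String(data).split(/\r\n|\r|\n/).forEach(function (line) {
    lines.push('data: ' + line);
  });
  // A blank line marks the end of the event.
  return lines.join('\n') + '\n\n';
}

console.log(JSON.stringify(formatServerSentEvent('Hello, how can I help?', 'chat')));
```

On the client side, a browser would typically listen with an EventSource object pointed at whatever URL the server streams these messages from.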
Here is a real world scenario (happened to me) demonstrating the benefit:
A customer of yours is having a problem with a product he purchased. He goes to your web site and follows the troubleshooting procedure. His problem is still not resolved, so he clicks on your link for live chat support.
First he is presented with a web page telling him he needs to install Java. Installing Java means a large download and restarting his browser. In addition to already being frustrated that the product is not performing to his expectations, he is now annoyed that your web site has made him install software and restart his browser.
Now his browser is restarted and he tries to access your chat. Java starts to load, and he sits there waiting and waiting and waiting for it to load. Then the Java plugin crashes his browser.
You just lost a customer for good, and when it happened to me, I actually left very bad feedback in the area of customer service on a prominent product review web site.
Using HTML5 technology for support chat could have allowed your customer to access chat on his first attempt and allowed him to receive the professional quality support you work hard to provide, instead of having technical hiccups related to third party browser plugins get in the way.
Reduce the software required for things to go right and they are less likely to go wrong.
Multimedia
Multimedia has been a cross-browser cross-platform issue for decades. Different types of audio and video required different types of plugins to adequately play, and these plugins were not necessarily available on every browser for every platform.
HTML5 browsers can play video and audio natively without relying upon a third party plugin, but there are some caveats:
Some browsers only support decoding video formats that do not require payment of patent royalties to distribute decoding technology. Firefox and Opera are examples of these browsers. For video they support Ogg Theora, and newer versions also support WebM. For audio they support Ogg Vorbis.
Other browsers, for reasons that do not seem to make sense, have chosen not to support the patent unencumbered formats. Instead they opt for H.264/AVC for video and either AAC or MP3 for audio.
Fortunately, the HTML5 media tags allow media to be presented in more than one codec. For example:
<audio id="FurElise" controls="controls">
  <source src="/audio/FurElise.ogg" type="audio/ogg">
  <source src="/audio/FurElise.mp3" type="audio/mpeg">
  <!-- object code for flash fallback -->
</audio>
Result of above code: (embedded audio player; audio source: Wikimedia Commons)
If the browser knows how to handle the audio/ogg mime type, it will use the Ogg Vorbis version of the audio file.
Otherwise, the browser will look at the next source node. If it knows how to play audio/mpeg it will use the MP3 version of the file.
Fortunately, every browser that supports HTML5 media but does not support Ogg Vorbis does support MP3, so by having two versions of the audio file, the audio will be playable by anyone who has an HTML5 capable browser regardless of what plugins they have installed.
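The selection order described above can be approximated in a few lines. This is a simplified sketch, not the browser's actual algorithm: pickSource is a hypothetical name, and the canPlayType argument stands in for the real media element method of the same name, which returns an empty string for types the browser cannot play.

```javascript
// Hypothetical helper: returns the first source whose MIME type is playable.
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    // canPlayType returns '' when unsupported, 'maybe' or 'probably' otherwise.
    if (canPlayType(source.type) !== '') return source.src;
  }
  return null; // none of the listed types is playable
}

const sources = [
  { src: '/audio/FurElise.ogg', type: 'audio/ogg' },
  { src: '/audio/FurElise.mp3', type: 'audio/mpeg' }
];

// Simulate a browser that can only play MP3.
const mp3Only = function (type) { return type === 'audio/mpeg' ? 'maybe' : ''; };
console.log(pickSource(sources, mp3Only));
```

In a real page the second argument could wrap document.createElement('audio').canPlayType, though the browser performs this selection on its own when given source nodes.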
Browsers that do not support HTML5 media simply ignore the audio, video, and source tags, as they are meaningless to those browsers, but they still process whatever else is inside the media tag. You can thus include a fallback method that uses a browser plugin inside the media tag, and those browsers will hopefully play it.
DOMBlogger Support
The DOMBlogger engine will generate the HTML5 media tags for you (including flash fallback code), and will also perform audio transcoding for you.
When you upload audio as a WAV, AIFF, or FLAC file, the lossless file will be encoded into both Ogg Vorbis and MP3 on the server. When you upload an MP3 or Ogg file, the lossy file will be transcoded into the other format for you, though this is not ideal.
Unfortunately, due to prohibitively expensive licensing costs related to patents, the DOMBlogger platform cannot decode or encode H.264/AVC videos. It can host them, but it cannot transcode them to or from Ogg Theora or WebM.
If you use the Apple Macintosh or Microsoft Windows platform for your personal computing, you can probably do the transcode yourself. Detailed instructions on how to transcode from H.264/AVC (what most cameras record to) to Ogg Theora will be written so that you can upload both an H.264/AVC and an Ogg Theora version of any video content you wish to share.
DOMBlogger uses the Flash-based FlowPlayer media player as a fallback for web browsers that do not support HTML5 media, and automatically generates the code necessary to use FlowPlayer as a fallback.
In order for the FlowPlayer fallback to work with video, an H.264/AVC version of the file has to exist. This is usually not a problem because most modern video cameras record to that format natively.
Unlike some popular solutions on the web, the code generated by DOMBlogger does not use any client-side JavaScript.
3D, Graphics & Effects
HTML5 provides for amazing interactive graphics capabilities through SVG and Canvas. Unfortunately, neither is supported by Internet Explorer versions below IE 9, and IE 9 is not available for Windows XP, which means users of Internet Explorer on Windows XP cannot use web applications that are built upon them.
As a content management system primarily focused on text content, DOMBlogger does not currently make use of either. It is, however, probable that some applications or widgets that make use of dynamic SVG with PNG/GIF fallback will be made available (e.g. for graph generation), though probably not with user interactivity until Windows XP has reached end of life ().
Performance & Integration
HTML5 provides several standardized mechanisms by which dynamic interaction can take place between the client and server without the web page needing to reload, including Web Workers and XMLHttpRequest.
Web Workers looks like it has the potential to be a bit of a resource hog on the end user's machine, and DOMBlogger will not have any part of that. Too many well-intentioned scripts that add bling but have no actual useful function (just look at all the recent Facebook changes ...) end up with bugs that can cause major badness for end users, sometimes even bringing the user's browser or even system to a crawl.
While I am sure Web Workers has its proper place, I am also sure it will often be used where it should not be used by developers who are not as clever as they think they are, have forgotten KISS, and really should have solved the problem (if there even was one) a different way.
XMLHttpRequest, on the other hand, is the heart of Ajax. It provides for both an enhanced user experience and less network traffic, and it is definitely a part of DOMBlogger.
For a basic example of the benefit: Click through Age Verification (CAV). CAV is the process by which a web site identifies itself as unsuitable for younger individuals (e.g. alcoholic beverage web sites) and asks the user to verify their age by clicking a link. I know, not very effective at actual age verification, but it is good PR and there actually has been some legislation in some locations requiring a warning page that allows the user to exit before being exposed to content of an adult nature. CAV at least provides this, even if it does not robustly verify the age of the individual requesting the content.
The traditional and still very commonly used method for implementing CAV involves initial loading of the web page the client requests followed by a JavaScript redirect to the CAV page. When the user verifies their age and desire to see the content, the user is then returned to the original page requested. The reason for the URL gymnastics is so that search engines (which do not know how to click to verify) can properly index the content. However, this method results in a lot of unnecessary page loading.
Using a very simple call to XMLHttpRequest, CAV can be done on the page initially requested and completed without the browser needing to reload any additional pages.
Before the script does anything, the normal page content is already loaded but hidden under a div node containing the CAV. Using this method, the div is actually at the end of the document so that it does not interfere with search spiders, but CSS is used to pin it over the normal content so that it is all a human user (with a graphical browser anyway) will see. The user clicks to verify their age and the script communicates to the server that the user verified their age and desire to see the content. The CAV div is then removed from the DOM, allowing the user to see the content they requested. Fewer page loads than the traditional method, less network traffic, and the originally requested content is instantly available.
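The click handling described above might look roughly like the following. This is a sketch under stated assumptions, not the DOMBlogger script: the function name confirmAge, the /cav/confirm endpoint, and the element ids in the usage comment are all hypothetical, and the request object is passed in through a factory only so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch: tell the server the user confirmed, then remove the overlay.
function confirmAge(overlay, createRequest, url) {
  const xhr = createRequest(); // in a browser: function () { return new XMLHttpRequest(); }
  xhr.open('POST', url);
  xhr.onload = function () {
    if (xhr.status >= 200 && xhr.status < 300) {
      // The server has recorded the confirmation; reveal the content underneath.
      overlay.parentNode.removeChild(overlay);
    }
  };
  xhr.send();
}

// Possible browser usage (the ids and the URL are made up for illustration):
// document.getElementById('cav-yes').addEventListener('click', function (event) {
//   event.preventDefault();
//   confirmAge(document.getElementById('cav'),
//              function () { return new XMLHttpRequest(); },
//              '/cav/confirm');
// });
```

If the link behind the click handler also points at a normal confirmation page, users with JavaScript disabled can still get through by following the link the traditional way.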
It is important to note that good web programming does not rely on JavaScript. DOMBlogger is coded in such a way that the web site continues to function even if JavaScript (and thus XMLHttpRequest) is disabled. Users surfing with JavaScript disabled will not benefit from the increased performance that XMLHttpRequest has to offer, but the site will still be fully functional.
Cascading Style Sheets
Cascading Style Sheets (CSS) tell your browser how it should render the provided content.
CSS3 is not yet ready for widespread use; most of it still exists as drafts and has not yet been published as W3C Recommendations.
Users of the DOMBlogger framework can use their own style sheets if they wish to implement CSS3 within their site design, but the templates DOMBlogger currently uses do not make use of anything CSS3 specific and probably will not for some time. DOMBlogger tries to comply with CSS Level 2.1.
CSS3 can do a lot of cool things. One thing that worries me is I have seen it used in some places where it really is not the appropriate technology to use, such as rendering a company logo (really should be done in SVG, not with CSS).
I am hoping that in the near future, many jQuery animations that currently set the style attribute can instead reference a CSS3 style sheet for an equivalent animated effect, thus needing fewer DOM refreshes for the same effect.