Roger Johansson over at 456 Berea Street, reflecting on a series of articles by John Allsopp regarding HTML semantics, asks the question: “Should there be another way of extending and improving the semantics of HTML without requiring the specification to be updated?”

Personally, I think the issue revolves around the misuse of HTML to mark up something other than research papers.

It is my understanding that HTML is an application of SGML, a markup language used to mark up research papers for mass reproduction on offset printers. As such, the vocabulary (the tags) in HTML reflects the type of data being marked up. Consequently, when HTML is used to mark up documents that are not academic in nature (are not research papers), authors are left cobbling together solutions to retain the semantic value, but that rarely works. For example, if you want to mark up a mathematical equation, you’ll need the MathML specification precisely because HTML doesn’t have the vocabulary necessary for describing the content.

I find it a little ironic that Tim Berners-Lee has basically turned everyone into an academic in some sense, by enabling them to do massive research and post their findings. However, current technology limits us to “browsing” research papers, even though we’ve creatively found ways to publish much, much more than that.

I think the world is missing a browser that is able to render a variety of markup languages (vocabularies), including HTML, MathML, XHTML, XHTML2, XForms, SMIL, and others (although the last two are not technically markup languages). I can imagine a world in which marketers define their own markup specification for sharing data safely (a problem I think microformats are trying to solve). In fact, markup languages can be defined for nearly any field. The problem is that we don’t have web browsers capable of rendering the data in the source documents in any meaningful fashion, because no formatting information is associated with any of the elements of these foreign markup languages. In fact, I find it hard to imagine what a marketing database or recipe list would look like if not some kind of document.

So, in conclusion, I’m not sure if I’ve made my point, but basically I think any semantic improvements in HTML will come from focusing on the domain it was originally intended for (academia) rather than from trying to extend it to other domains that have little or nothing to do with writing research papers.

As a young boy, whenever I asked for the definition of something I was told to “look it up”. I hated that answer! It seemed so futile: if the task was to get something done and you knew the answer, why should I look it up?

It is clear to me now that I was told to look it up as a young boy to get me into the habit of being independent, of being able to fend for myself, and probably more importantly, not bothering busy people when the answer was available elsewhere.

In fact, that response has led to me:

Learning to read by looking up words in the dictionary.

Learning to juggle by dropping lots of balls.

Reading El Quijote in Spanish the same way I learned how to read.

Learning to connect to the Oxford English Dictionary hosted at the University of Illinois Urbana-Champaign from home via a 2400-baud modem and a connection to the dial-up lines at UIC-Chicago the same way I learned how to read (and by bothering more than one techie…).

Along the way I also learned:

  • how to use PowerPoint (via Aldus Persuasion, which in my opinion was infinitely more powerful)
  • how to write HTML
  • how to write JavaScript
  • how to program in Perl, PHP, XSLT, ASP, JSP, JavaScript, AppleScript, Java, .Net, Fusebox, VBA, C, C++, Bash, SQL (Server), HyperCard, Director, Authorware, MS Office Macros, Photoshop Actions, Flash (yes, I even learned how to program in Flash), and more that I’m forgetting…

And remember, I started out not knowing how to read.

I guess my teachers weren’t so dumb after all.

And the next time you consider asking someone you know for the answer to a question just as easily answered by Google, consider looking it up first. Just look where it might lead you!

Based on these instructions on how to compile mod_dav_svn, I managed to get our old Red Hat server serving our public Subversion repository.

I’m a little surprised by how little documentation there is on how to do this considering it is such a great way to make a code repository available to authorized users. I was unable to find any clear information on how to do this on subversion.tigris.org or on apache.org.

So, how do I know what name to use with the Apache configuration option --enable-MOD_NAME? The configure option --enable-mods-shared=all is a nice shortcut, but not very realistic in a real hosting environment. I’ve read in several places that you should only enable the modules you are really going to use, and enabling all of them just seems like a bad idea. Can anyone help?
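One way to find the accepted names is to ask the configure script itself. The sketch below assumes you are in the top directory of an unpacked Apache httpd source tree. `list_module_flags` is a made-up helper name, and the modules in the commented-out line are only an example selection. As far as I can tell from the Apache install documentation, the name after --enable- is the module name with the mod_ prefix removed and any underscore turned into a dash, so mod_dav_fs would become --enable-dav-fs.

```shell
# Hypothetical helper: print the --enable-*/--disable-* switches that the
# configure script in the current directory says it understands.
list_module_flags() {
    ./configure --help | grep -E -- '--(enable|disable)-'
}

# A narrower build than --enable-mods-shared=all might then look like this.
# Assumption: DSO support plus the two DAV modules is all you need;
# check the output of list_module_flags before copying it.
# ./configure --enable-so --enable-dav=shared --enable-dav-fs=shared
```

Run `list_module_flags | less` first and pick only the switches for modules you actually plan to load.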

WARNING: major rant!

I understand why some people like the term “web 2.0” and its ilk. The problem is it leads people with little or no idea of the web to believe it is a product produced by a single organization and that it will have incremental “releases” when it is the exact opposite and evolves daily.

Furthermore, the concepts the term tries to define have been around since the beginning of computer mediated communication, so they’re not new, just redefined, and thus lead to additional confusion among novice web users. Fundamentals of usability state quite clearly that we should call things by their names, not invent new ones to sound cool, or just new. The only people who find value in marketing terms are marketers. The rest of us experience such terms as garbage and filter them out.
Also, since this is not a product produced by a single organization, the use of the numbers leads me to wonder where the numbering will stop.

If you are a web developer or work in the internet field, please, I beg you, don’t use “web 2.0, 3.0, 4.0” etc. when trying to define what you are doing. There are plenty of other techno-babble terms that already exist and probably suit your needs just as well. In my opinion, using such terms just makes you sound uninformed.

I feel a little better now that I got that off my chest…

PS: The other techno-babble terms linked above were a byproduct of the first Internet bubble. It shouldn’t surprise anyone that these terms continue to show up in today’s “web 2.0” world. Empty language = noise. Let’s try improving the signal-to-noise ratio and minimize the interference. My job is already hard enough.

Following the instructions for “shelving” work in progress using Subversion, I tried to merge my work back onto the trunk, but kept getting the following “error” messages:

“Skipped 'path/to/file/on/shelf/but/not/in/trunk'”

I read and re-read the Subversion documentation for merging and it made it sound like I was doing something wrong. I finally found a posting with instructions on how to merge in spite of the “Skipped” error message… so I tried it, and it worked (in spite of the misleading messages). The trick really is to ignore the messages.

Note that following the merge, files that are in the source branch and not in the destination branch need to be svn added before they will end up in the destination.
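The merge-then-add routine described above can be sketched roughly as follows. This is not the exact command from the posting I found; it is a minimal sketch using the two-URL form of `svn merge`, and `merge_shelf`, the URLs, and the working copy path are all placeholders you would replace with your own.

```shell
# Hypothetical helper: merge a shelf branch into a trunk working copy, then
# list whatever is still unversioned so it can be added by hand.
merge_shelf() {
    shelf_url=$1   # assumption: URL of the shelf branch
    trunk_url=$2   # assumption: URL of the trunk
    wc=${3:-.}     # trunk working copy, defaults to the current directory

    # Apply the difference between trunk and shelf to the working copy.
    # "Skipped" lines may show up here.
    svn merge "$trunk_url" "$shelf_url" "$wc"

    # Unversioned files are flagged with "?" in the first column of svn status.
    # "|| true" keeps the function from failing when there are none.
    svn status "$wc" | grep '^?' || true
}
```

Anything listed with a “?” still needs an `svn add` before the final `svn commit`.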

I sincerely hope this post helps others save the two or three hours I lost trying to figure out what I was doing wrong!

Taking a break from writing and posting my own articles, I’d like to direct your attention to a post about the real meaning of Search Engine Optimization (SEO) and a great example of the kind of smarmy people trying their best to steal your money. Would anybody ever really buy anything from the guy in the video? I almost feel sorry for him, almost. The bottom line is, SEO really just means good web design (something I’ve been trying to communicate ever since I arrived here).

This is just a quick post about a great article on how to “shelve” a project with local changes so you can work on something else without sending the changes to the trunk.

For those of you wondering what I’m talking about, when we write software, we use systems that allow us to undo changes on hundreds of files at a time. This can be very useful when, in the middle of adding new features, you need to fix a security vulnerability but you don’t want to lose the work you’ve already done.

If you were building a house and had already laid the foundation, built the frame, run the electrical wiring and plumbing, put up the drywall, and then realized you forgot the insulation between the foundation and the frame, it would be a small disaster.

Shelving code allows us to lift the entire house up off the ground so that we can put in the insulation, and then return the house to its rightful place without losing our work.

Techniques such as this allow us to be very productive programmers.
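For the curious, shelving with Subversion can be sketched in four commands. This is my own minimal sketch, not a copy of the article’s steps: it assumes the shelf is an ordinary branch copied from the trunk, and `shelve_work` and both URLs are hypothetical names.

```shell
# Hypothetical helper: park uncommitted changes on a shelf branch and return
# the working copy to a clean trunk. Run it from the top of the working copy.
shelve_work() {
    trunk_url=$1   # assumption: URL of the trunk
    shelf_url=$2   # assumption: URL where the shelf branch should live

    # 1. Create the shelf as a copy of the trunk in the repository.
    svn copy "$trunk_url" "$shelf_url" -m "Create shelf branch" &&
    # 2. Point the working copy at the shelf; local changes are kept.
    svn switch "$shelf_url" &&
    # 3. Commit the work in progress to the shelf, not the trunk.
    svn commit -m "Shelve work in progress" &&
    # 4. Switch back to the trunk, which is now free of those changes.
    svn switch "$trunk_url"
}
```

After that the working copy is ready for the urgent fix, and the shelved work can be merged back later.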