I’ve heard many stories about people making lots of money via Google’s AdWords and AdSense programs. Most of them make this income via AdSense arbitrage: buying AdWords traffic for less than the income generated by the AdSense ads appearing on your own site. You pocket the difference.

I don’t believe 1% of what I hear, so I decided to research these claims. While researching, I stumbled on Spyfu, which offers a list of the most expensive keywords being used in AdWords. The astute reader familiar with Google’s AdWords and AdSense programs will immediately recognize the arbitrage opportunity.

Living in a vacation paradise, having a large selection of potential “stock” photos, knowing something about SEO, and having just received a gift certificate for 50€ in AdWords, I decided to try a little experiment. According to Spyfu.com, “hotels”, “travel”, “flights”, “rental”, “vacations”, and “holidays” are among the higher-paying keywords one would associate with my collection of stock photos.

The goal was to spend (invest) the 50€ on AdWords at rates lower than what other advertisers are paying to have their ads appear on my site, and then hope that users either buy a license for my photographs or click on an advertisement. This is classic “arbitrage” but avoids breaking Google’s AdSense guidelines, because the primary goal is to sell licenses and give users some pretty pictures to look at.
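
The arithmetic behind the bet is easy to sketch (the numbers below are purely hypothetical, not from my campaign):

```python
# Illustrative arbitrage arithmetic; the numbers are invented, not from my campaign.

def arbitrage_profit(visitors, cpc_in, ctr_out, rpc_out):
    """AdSense revenue from outgoing clicks minus AdWords spend on incoming visitors."""
    cost = visitors * cpc_in                 # what you pay AdWords to bring visitors in
    revenue = visitors * ctr_out * rpc_out   # what AdSense pays when a fraction click out
    return revenue - cost

# Buying 1000 visitors at 0.05€ each, with a 2% AdSense click-through rate
# paying 0.50€ per outgoing click: revenue 10€, cost 50€, net -40€.
print(round(arbitrage_profit(1000, 0.05, 0.02, 0.50), 2))  # -40.0
```

In other words, the spread between incoming and outgoing click prices has to be enormous (or the outgoing click-through rate unusually high) before the bet pays off.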

I set up a “stock photography” subsite on tedmasterweb.com with lots of pictures of landscapes here in the islands, in other parts of Europe, and one of my dad’s farm outside Chicago. Nothing complicated, but fine-tuned for SEO, ease of use, and directed at potential stock photo buyers or anyone who likes to look at pretty pictures of places they are going to visit.

At first the ads appearing on the site were all for financial-related things (like masters degrees and stock trading systems). I changed the subdirectory from “stock” to “stock-photography” and the ads, thankfully, changed to reflect more travel and graphic arts related offers. I also added an AdSense placement on the landing page (rather than just on the enlargement pages). These two changes improved my “conversions”. In just a couple of days I had 3 clicks on ads (whereas prior to the 50€ I had gone months without a single click!). This great article on another AdSense arbitrage experiment corroborates my findings (money can be made, but you’re likely to lose at first).

In the end, as you might have guessed if you have any experience at all in this field, the experiment proved what any reasonable person would assume: my ROI on this so far is a loss of 48€ (plus the several hours I put into setting everything up, which I could have billed at 70€/hour for any of my clients).

The next step, of course, will be to improve the quality of the textual content on each page so that it targets holiday travelers more directly and encourages them to link to these pictures (and, ultimately, click on the ads too), and to target my keywords better (so that only genuinely interested people click through to my site). I’ll post an update at some point, so stay tuned!

But before I go…

I also signed up for Google’s AdSense for domains since I had several domains I’d purchased as part of this same experiment (but a more developed version of it). We’ll see if this provides any additional income. Here are the domains in case you’re curious:

1. Stock holiday pictures (without hyphens)
2. Stock holiday photographs
3. Stock photography royalty free
4. Stock vacation pictures (without hyphens)
5. Stock vacation photographs

We’ll see how all this turns out, but the next time someone tells you they’re bringing in 4,000€/month in AdSense income, tell them you want to see their AdSense account before you’ll believe it.

I recently implemented a newsletter subscription form in ASP.NET (2.0) for the CGIAR Secretariat on behalf of CGNET. This is the second project I’ve done in ASP.NET for CGIAR. I’ve never considered myself a skilled ASP developer and, like many, picked up my ASP skills from code I’d seen on the intertubes and via transfer from related languages. In other words, prior to this project, I was a somewhat capable ASP spaghetti coder.

Tired of producing mediocre code and eager to learn what this whole .Net thing was all about, I decided to invest some time learning how to write better ASP and take advantage of as many features of .Net as I could. Armed with two really good books on the subject (Beginning ASP.NET 3.5 in C# and VB and Programming ASP.NET 3.5, 4th Edition), I learned a lot about the .Net revolution and in the end I significantly improved the quality of my code.

.NET borrows heavily from other MVC-like frameworks. I was surprised by the number of similarities between the ASP.NET way of doing things and the Fusebox way of doing things. The rest of this post examines some of these similarities and other aspects of working with ASP.NET, mostly from a (PHP) Fusebox developer’s point of view.

The Project

The CGIAR Secretariat is responsible for www.cgiar.org. The CGIAR Newsroom is one of the primary sections of their web site. It includes an aggregate RSS news feed of all news items coming out of many of the CGIAR centers. Since many people are still unaware of the advantages of RSS, the CGIAR Secretariat asked if we could set up a system that would allow people to subscribe to the feed via email. Specifically, the system we set up allows people to subscribe and/or unsubscribe via cgiar.org, which then automatically sends periodic emails of recently added news items (as they’re added, of course).

The Bottom Line

As an “experienced” web application developer I very much appreciated the ASP.NET-way of doing things. There was nothing in this project that ASP.NET wasn’t able to handle elegantly and more or less efficiently. The project consisted of implementing the following features:

  • A public subscribe and unsubscribe form with CAPTCHA
  • A nightly script that produces a notification email, with alternate views (plain text and html), consisting of news items that had not been sent in prior emails (which implies keeping track of what’s been sent and what hasn’t)
  • A password-protected administration interface
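
The trickiest of the three is the second: the nightly job has to remember which items it has already mailed. Stripped of the ASP.NET and email details, the bookkeeping looks roughly like this (a sketch only; table and column names are invented, and the real system used a different schema and database):

```python
# Sketch of the "don't resend" bookkeeping behind the nightly notification job.
import sqlite3

def unsent_items(conn):
    """Return news items that have never appeared in a notification email."""
    return conn.execute(
        "SELECT id, title FROM news_items WHERE sent_at IS NULL ORDER BY id"
    ).fetchall()

def mark_sent(conn, item_ids):
    """Stamp items as sent so the next nightly run skips them."""
    conn.executemany(
        "UPDATE news_items SET sent_at = datetime('now') WHERE id = ?",
        [(i,) for i in item_ids],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news_items (id INTEGER PRIMARY KEY, title TEXT, sent_at TEXT)")
conn.executemany("INSERT INTO news_items (title) VALUES (?)", [("A",), ("B",)])

items = unsent_items(conn)            # both items are pending on the first run
mark_sent(conn, [i for i, _ in items])
print(len(unsent_items(conn)))        # 0 -- nothing left to send tomorrow
```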

The entire project was completed in about 80 hours (including an initial version of the administration interface which was later tabled).

If given the choice of doing the same project in PHP Fusebox, assuming I had the same knowledge and experience with PHP that I had with ASP when I started, would I have chosen PHP Fusebox over ASP.NET? Maybe…

Master Pages

One of the goals of any application framework is to maximize code reuse (and, conversely, minimize code duplication). Functions (methods) are one example of how this is accomplished, but when it comes to the presentation layer, developers often find they need a more powerful programming model. Both ASP.NET and Fusebox (and many other web application frameworks) provide templating tools capable of complete presentation-layer solutions. In ASP.NET, a Master Page is a template for all the pages of a site, although its functionality goes beyond that of a simple templating engine. Master Pages also let you define “behaviors” common to all pages, somewhat similar to the Fusebox 3 fbx_Settings file or the Fusebox 4 fusebox.init file.

A powerful templating engine will frequently go beyond “one layer” and allow the developer to subdivide sections of the Master to be handled by other parts of the application. ASP.NET offers this functionality directly and at least one project I’ve worked on in Fusebox had the same functionality. I found a few opportunities to use this feature on this project.

Fusebox does not offer a templating engine out of the box, but you can easily create much of this capability in Fusebox 4 (and to some degree in Fusebox 3) using Content Variables and a “layout” circuit. Most projects I work on include such a circuit.

In the end I got a lot of mileage out of Master Pages for this project and hope to be able to use them in future projects for CGIAR.

Postback

By default, every ASP.NET page contains a form that executes on the server. The value of the “action” attribute for the form is the current file. Microsoft has termed this approach “postback” because you post the form back to the same document that created it. In some respects this is similar to the Fusebox implementation of the Front Controller design pattern, where every request is for the same server-side script (e.g.: /index.php) followed by a query string containing directions on which files, functions, and procedures to execute.
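
For readers unfamiliar with Fusebox, the front controller idea can be sketched in a few lines (a toy sketch; the fuseaction names are invented):

```python
# Toy front controller in the Fusebox spirit: every request hits one entry
# point, and a query-string parameter ("fuseaction") decides what actually runs.
def front_controller(query):
    fuseactions = {
        "home.main": lambda: "home page",
        "news.subscribe": lambda: "subscription form",
    }
    fuseaction = query.get("fuseaction", "home.main")  # the default fuseaction
    return fuseactions[fuseaction]()

print(front_controller({}))                                # home page
print(front_controller({"fuseaction": "news.subscribe"}))  # subscription form
```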

ASP.NET offers the developer server-side elements known as panels. Panels are “controls” (elements?) that contain other controls. By setting the Visible property of a panel you can control whether or not its contents appear on the page. For the CGIAR project mentioned above, I used this technique to display either the subscription form or the “thank you” message following a successful subscription. I suppose that, if each ASP.NET page is the equivalent of a Fusebox circuit, each panel could be the equivalent of a fuseaction. You would simply set the visibility of all of them to false (except the default fuseaction 😉 ) and then display them as needed. Coming from Fusebox, I found the concept very easy to grasp.

Code-behind

Microsoft made an attempt to separate application logic from presentation with .NET. In my opinion, they succeeded.

In order to minimize the amount of raw code found in HTML, .Net provides something known as “code-behind” pages, which are essentially includes with the same name as the file they are attached to. The idea is that your application code goes in the code-behind page and if you need to modify the presentation (the HTML) from within the application code, you do so by referencing elements in your HTML page via their ID attribute (this is an oversimplification but summarizes the approach).

Fusebox, on the other hand, tries to separate code by prefixing file names with one of dsp_, qry_, act_ (and sometimes lyt_).

  • dsp_ files contain the presentation markup (the HTML) that is echoed to the browser.
  • qry_ files contain datasource queries and (usually) return a recordset or array that a dsp_ file later echoes to the browser.
  • act_ files are for those instances in which you need to process data prior to executing a query or echoing to the browser.
  • lyt_ files are, in essence, the same as Master Pages in ASP.NET.

In ASP.NET the “HTML” files contain a LOT of .Net namespaced elements. Because those controls are themselves well-formed XML, the files can be completely valid XHTML (with the single, notable exception of the processing instructions found at the top of each page). The benefit is that these files are suddenly very portable and can be consumed by any system capable of reading XML.

If you wanted to reproduce this ASP.NET functionality in Fusebox, you would need to write a plug-in that parses the dsp_ files looking for <fbx: elements and responds accordingly. You could put all of your code in act_ files which would essentially turn them into code-behind files. Now there’s a potential open source time sucker!
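
A toy sketch of that plug-in idea (the <fbx:…> element name and the handler are invented for illustration): scan a dsp_ template for namespaced elements and replace each with the output of a registered handler.

```python
# Toy sketch of a template-parsing plug-in: find <fbx:...> elements in a
# dsp_ file and substitute the output of a handler (names are invented).
import re

handlers = {"fbx:content": lambda attrs: "Hello from the act_ layer"}

def render(template):
    def substitute(match):
        name = match.group(1)
        # Unknown elements are left untouched; known ones get their handler's output.
        return handlers.get(name, lambda attrs: match.group(0))(match.group(2))
    return re.sub(r"<(fbx:\w+)([^>]*)/>", substitute, template)

print(render("<p><fbx:content/></p>"))  # <p>Hello from the act_ layer</p>
```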

Data Controls

As one would expect, the data controls are very complete, but figuring out how to do something like nesting GridViews was not obvious; were it not for a Nested GridView walk-through article on MSDN, I never would have figured out how to do it. Furthermore, I’m not convinced that executing SQL on EVERY ROW of a record set is really a good idea (the authors of the walk-through admit this is not the best approach, but only as it concerns caching…).

Much of the CGIAR Newsroom revolves around their RSS feed (which is a compilation of feeds from all of the CGIAR Centers). ASP.NET 2.0 and above include controls for using XML as a data source and thus facilitating the display of XML data in a web page.

Unfortunately, in version 2.0 of .Net (and possibly higher), using XML as a data source only allows you to display the data. It does not allow you to use the built-in INSERT, UPDATE, and DELETE
features of the GridView control. There are work-arounds for implementing this missing functionality but I have to wonder if you save any time hacking in the functionality vs. building the entire administration interface the “old” way (which can be done pretty quickly using XSLT). By my estimates, it’s a draw, at best.

ASP.NET Advantages

For my needs, web controls, validation in particular, significantly reduce web application development time. I simply cannot express how much I like the validation controls and WISH PHP had something similar!

IntelliSense greatly speeds up coding. Microsoft offers several free (web) application development tools that work quite well, are more than adequate for the projects I usually work on, and in many cases include IntelliSense.

I don’t think I could have completed this project as quickly without IntelliSense.

ASP.NET Ambiguities and Disadvantages

Since no language is free of sin, here is my list of things that got the best of me while working on this project:

  • web.config cannot be part of the code repository since it is machine-dependent. If application configuration options are so different between deployment environments, then maybe the author of the application should consider using a different development environment. I would prefer to have the application configuration right in the application, tucked into code that “sniffs” the environment and configures accordingly. This makes for MUCH more portable code.
  • .Net developers frequently publish their source code, but it usually needs to be compiled, so unless you’re into that or have the time to learn how to do it (and do it right), you won’t find the same kind of huge open source community of code that exists for PHP.
  • ASP.NET 2.0 includes some Authentication and Authorization controls, but like most stuff like this, you have to do things the ASP.NET-way or you can’t use these controls. In other words, there is no (apparent) way to retrofit these controls onto existing authentication mechanisms. In the end this is probably a good thing since most existing authentication methods are not very secure, but in the real world most clients simply are not willing to put money into changing existing systems unless you can clearly demonstrate they are broken.
  • One of the main complaints of Fusebox 4 was the XML files. Rather than being used simply for configuration, you could easily add business logic to them. More than one programmer has asked herself: “Why bother with XML to represent classes? Why not just use classes directly?” I must say, when programming in ASP.NET, I often feel like I’m simply setting application configuration parameters which, for anything but the most basic interactions, makes programming harder (and possibly more time-consuming) rather than easier (and faster) since you have to have a clear understanding of what state the application is in at the exact point where your code appears. This can be harder than it seems.
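
The environment-sniffing configuration mentioned in the first bullet above can be sketched like this (a sketch only; the hostnames and settings are invented):

```python
# Environment-sniffing configuration: one portable codebase picks its own
# settings based on where it is running (hostnames/settings are illustrative).
import socket

def app_config(hostname):
    """Return settings appropriate to the machine we are running on."""
    if hostname.startswith("dev-"):
        return {"debug": True, "db_server": "localhost"}
    return {"debug": False, "db_server": "db.example.com"}

config = app_config(socket.gethostname())
```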

In the end, if I had to do it all over again (and had the choice), I would probably stick with PHP Fusebox, but I’m grateful I had the opportunity to improve my knowledge of ASP.NET and I wish the CGIAR Secretariat the best with their new system.

Additional Reading and Links

Convert ASP.NET applications into real MVC frameworks

Fusebox Basics

Comparison of Web Application Frameworks

I’ve been experimenting with a couple of tools for creating cross-browser web designs. I’m quite happy with the results (which will be used on production sites in the coming weeks). I’m no longer plagued by the woes of differing font sizes, incorrect positioning, CSS hacks, etc. that make a web developer’s life a misery.

I am using the 960 grid system for managing layout in combination with a blog post on how to get cross browser compatibility every time, a simple list of DOs and DON’Ts when writing the HTML and CSS for the first time.

The combination has been a major time-saver (and FOR ONCE I can have a multi-column design WITHOUT using tables)! I cannot recommend these two links highly enough. The only remaining doubt I have is whether to use ems or pixels for padding and margin sizes. My brief experimentation suggests avoiding setting such values altogether, but if they must be set, use pixels.

I’d love to hear about others’ experiences with these tools (and other, similar tools).

The thing I’ve most disliked about the term “Web 2.0” is that most people use it to define something old as if it were something new. For example, some people have used the term to define myspace.com as a Web 2.0 application. However, they should remember that Geocities offered many of the same capabilities as myspace.com, but has been around since long before the term Web 2.0 was coined (since late 1995, to be exact).

Finally, Tom Scott has done a great job underlining the essence of Web 2.0 applications. He describes exactly that part of Web 2.0 applications that is actually new. He also underlines the importance of good URL design and points out that it is even more important than graphic design (and I fully agree), but even good URL design isn’t exactly new. In fact, it’s been the mantra of many since the very beginning of the web, and for exactly the same reasons.

In the end, perhaps the only thing that differentiates these new web applications from pre-Web 2.0 web applications is the effort they put into making their data accessible to other systems. And if this is the case, is it really worth coining a new term to describe it? If so, why aren’t we just calling them what they are: web applications based on open standards? And finally, I don’t think the fact that most emerging web applications are embracing open standards should come as a surprise to anyone since it is a great way to grow your user base.

I’ve finally updated my PHP BBEdit Clipping Set (formerly known as a glossary). The new clipping set:

  • Continues to contain around 6,000 function definitions (whatever is in the official PHP documentation)
  • Is based on a recent version of the official documentation
  • Allows tabbing between function parameters (very handy…)
  • Provides additional information when inserting a function (the return type, for example)
  • Optionally includes predefined constants for each function

As always, there are links to instructions for creating your own set (be sure to download the Extras to get the source XSLT stylesheets) and donations are greatly appreciated.

Download PHP BBEdit Clipping Set

Periodically, Russ over at maxdesign.com.au publishes his list of “links for light reading” (an ironic name considering the actual content is usually quite dense). I frequently find several very interesting articles on that list. Not having the time to compile a similar list myself, I looked for a quick and dirty alternative.

Since I use Google Reader I’m in luck. Reader offers the ability to make items “public”. It also allows you to “star” items that then end up on your starred item list. And it offers the ability to merge the two, so, with this blog entry, I formally announce the publication of my starred items list available for your consumption.

(originally posted to the BBEdit-Talk list, but posting here too since the answer might help others)

I’m looking for a regex pattern that will find quoted strings (double quotes) but skip (double-)quoted strings containing any of the following characters: $, ', ", \ (dollar sign, single quote, double quote, backslash)

At first I tried "[^$'"\\]+?" but it was matching the end of one quoted string and the beginning of the next, so I’m clearly missing something.

Regexes in Depth: Advanced Quoted String Matching was helpful, but didn’t explain how to negate strings containing the characters above.

Strings that should fail to match:

// contains quotes
$str = "`zcol ACOL` NUMBER(32,2) DEFAULT 'The "cow"
(and Jim''s dog) jumps over the moon' PRIMARY,
INTI INT AUTO DEFAULT 0, zcol2"afs ds";

// contains dollar signs, backslashes and single quotes
ADOConnection::outp( "
-- $_SESSION['AVAR']={$_SESSION['AVAR']}",false);

// contain single quotes
if (strncmp($val,"'",1) != 0 && substr($val,strlen($val)-1,1) != "'") {

Strings that should successfully match:

$myvar = "this is my quoeted ".$and_another_var." and another string";

Also, quoted strings should not be preceded with a backslash.

I’ve read and reread the BBEdit docs (which are great) but I’ve been unable to come up with a method that passes all of these tests.

I never had any idea this could be such a complicated problem. Does anyone see what I’m missing?
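
For anyone who wants to experiment, a small harness makes it easy to try candidate patterns against the test strings above. This one is in Python rather than BBEdit’s grep (the syntaxes are close but not identical), and the candidate pattern is just my best guess at the requirements, not a definitive answer:

```python
import re

# Candidate (an assumption, not the final answer): a double-quoted string,
# itself not preceded by a backslash, whose contents contain none of " ' $ \
candidate = re.compile(r'(?<!\\)"[^"\'$\\]*"')

# A string from the post that should fail to match (contains single quotes):
assert candidate.findall(
    """if (strncmp($val,"'",1) != 0 && substr($val,strlen($val)-1,1) != "'") {"""
) == []

# ...and the one that should match (two well-behaved quoted strings):
matches = candidate.findall(
    '$myvar = "this is my quoeted ".$and_another_var." and another string";'
)
assert matches == ['"this is my quoeted "', '" and another string"']
print("all cases behave as expected")
```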

Update

Matching with a negated character class is prone to difficulties because it’s hard to control what comes immediately before and after the match. That’s why I ended up using the following, which worked more or less well for me and avoided matching properly quoted strings inside HTML attributes.

(?s)(?<!name=|action=|align=|valign=|width=|height=
|nowrap=|scope=|class=|id=|style=|type=|value=|method=|border=
|cellspacing=|cellpadding=|colspan=|size=|maxlength=|for=|label=
|rows=|cols=|wrap=|language=|href=|version=|fuse=|charset=|src=
|alt=|title=|xmlns=|http-equiv=|rel=|content=|rowspan=|checked=
|accept=|face=)(?<!')(?<!\\)(?<!\?>)
"((?!\.|,|, | ,| , |\. | \. |:| :|: | : )[[:alnum:]
\-_.,:%@<>?()*/]*?(?<!\\))"

Update 2

Give me a break! Here’s the solution to this problem: matching quoted strings.

We’re launching a new small business server product in the coming weeks, ideal for small businesses that need automated backups (and restores), shared internet, shared files, and one or two other goodies. The server is only available for rent, starting at 200€/month (including maintenance). This product is, to some degree, the culmination of about 3 years of running our own small hosting environment which, as far as we can tell, has not (yet) been compromised. I doubt we could keep a determined hacker from getting in, but we’ve so far been able to keep the script kiddies at bay. Here are some of the things we’ve learned along the way.

Use a firewall, even a software-based firewall such as the Endian Firewall. You’ll have to work some magic internally if you want to use host-based routing, but the added complication makes hacking more complicated too, and unless you’re a really juicy target, most attackers will go elsewhere (we presume).
Install and configure mod_security (it claims to protect against XSS and many other things automagically). We haven’t been able to verify its functionality, but just knowing there’s another layer there makes us feel better 😀

PHP

  • turn off allow_url_fopen (the remote fopen wrappers)
  • turn off register_globals
  • turn off expose_php
  • disable unused functions and classes
  • install only the extensions you’re sure you’ll need
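
Most of those items map onto php.ini directives; here is a hypothetical excerpt (the values, and disable_functions in particular, are examples to be tuned per application):

```ini
; php.ini -- hardening excerpt (illustrative values)
allow_url_fopen = Off        ; turn off remote fopen wrappers
register_globals = Off       ; off by default since PHP 4.2, but be explicit
expose_php = Off             ; don't advertise PHP in the X-Powered-By header
disable_functions = exec,passthru,shell_exec,system,proc_open,popen
disable_classes =
```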

Disable other server-side scripting engines and CGI (assuming you are running PHP as an Apache module).
Turn off other unused services:

  • email
  • telnet
  • ftp
  • ssh
  • etc.

Uninstall unneeded software (such as the whole Gnome interface and anything that requires runlevel 5 to function; this is a server, after all). You might even consider building the server from a base install of Debian or Ubuntu Server (both of which fit in 64 MB of memory).
Log everything and increase the log history (a double-edged sword).

Don’t expose what web server you are running (or PHP or any other server-side technologies) in HTTP responses. In fact, if possible, alter the server signature (and fingerprint) to something unrecognizable or too generic to be of much help.
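
On Apache, for example, much of this can be done with a few directives (the mod_headers line assumes that module is loaded; fully altering the fingerprint takes more work, e.g. mod_security’s SecServerSignature):

```apacheconf
# httpd.conf -- reduce what the server reveals about itself
ServerTokens Prod          # send "Server: Apache" instead of full version details
ServerSignature Off        # no version footer on error pages
Header unset X-Powered-By  # requires mod_headers; hides PHP's header
```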

I’m sure there are more tips I’m forgetting, but these should help you get started. I’d love to hear others’ experiences and tips if you care to share…

Roger Johansson over at 456 Berea Street, reflecting on a series of articles by John Allsopp regarding HTML semantics, asks the question: “Should there be another way of extending and improving the semantics of HTML without requiring the specification to be updated?”

Personally, I think the issue revolves around the misuse of HTML to mark up something other than research papers.

It is my understanding that HTML is an application of SGML, a markup language used, among other things, to mark up research papers for mass reproduction on offset printers. As such, the vocabulary (the tags) in HTML reflects the type of data being marked up. Consequently, when HTML is used to mark up documents that are not academic in nature (are not research papers), authors are left cobbling together solutions to retain the semantic value, and that rarely works. For example, if you want to mark up a mathematical equation, you’ll need the MathML specification precisely because HTML doesn’t have the vocabulary necessary for describing the content.

I find it a little ironic that Tim Berners-Lee has basically turned everyone into an academic in some sense, by enabling them to do massive research and post their findings. However, current technology limits us to “browsing” research papers, even though we’ve creatively found ways to publish much, much more than that.

I think the world is missing a browser that is able to render a variety of markup languages (vocabularies), including HTML, MathML, XHTML, XHTML2, XForms, SMIL, and others (although the last 2 are not technically markup languages). I can imagine a world in which marketers define their own markup specification for sharing data (a problem I think microformats are trying to solve) safely. In fact, markup languages can be defined for nearly any field. The problem is, we don’t have web browsers capable of rendering the data in the source documents in any meaningful fashion because no formatting information is associated with any of the elements of these foreign markup languages. In fact, I find it hard to imagine what a marketing database or recipe list would look like if not some kind of document.

So, in conclusion, I’m not sure if I’ve made my point, but basically I think any semantic improvements in HTML will come from focusing on the domain it was originally intended for (academia) rather than from trying to extend it to other domains that have little or nothing to do with writing research papers.