
Curing Web insanity

Gearhead archive

There is little doubt that the Web has gone insane. Where once you could meander happily around the 'Net loading pages, you now find pages opening over the page you wanted (pop-ups), under the page you wanted (pop-unders) or opening when you leave the page (more pop-ups). There are also pages that force a refresh, pages with nosy JavaScript, pages with acres of blinking text and countless other pages that just generally tick you off.

So you have two choices - fix the problem or stop using the Web.

The latter being impractical, we'll go for fixing the problem. The solution is a utility - and a free utility at that! - called Proxomitron, the creation of Scott Lemmon. You can find this fantastic tool (do you think we're a little excited?) at the wonderfully named spywaresucks.org.

Proxomitron is a simple idea: It's a proxy server that can parse Web pages and match patterns in the text of the retrieved HTML code to look for code that will do something you don't like.

Here's how it works: When a Web browser requests a URL from the proxy (which runs on your PC or any machine you please), the proxy retrieves the URL contents and attempts to match the text in the contents with rules defined in Proxomitron.

When a pattern match is found (say, for a pop-under ad), Proxomitron changes the matched code into a comment that the browser doesn't display. Optionally, new code can be added based on the original code.
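To make the match-and-neutralize step concrete, here is a minimal Python sketch. It is not Proxomitron's code and it doesn't use Proxomitron's rule syntax: the function name `neutralize_popups`, the idea of flagging any script block that calls `window.open`, and the sample page are all assumptions made for illustration.

```python
import re

# Hypothetical rule: find every <script>...</script> block in the page.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script>", re.IGNORECASE | re.DOTALL)

def neutralize_popups(html):
    """Turn script blocks that call window.open into HTML comments."""
    def to_comment(match):
        block = match.group(0)
        if "window.open" not in block:
            return block  # leave scripts that don't open windows alone
        # "--" inside a comment could end it early, so break it up.
        return "<!-- filtered: " + block.replace("--", "- -") + " -->"
    return SCRIPT_BLOCK.sub(to_comment, html)

page = '<p>News</p><script>window.open("ad.html")</script><p>More</p>'
print(neutralize_popups(page))
```

The offending script is still in the page source, but because it now sits inside a comment the browser never runs it.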

Before we get into how the tool is configured and how it works with your browser, we should first cover how it matches text patterns.

The tool has its own text-matching language that is a lot like regular expressions (see this column) but with some additional wrinkles. The rules are in several parts, the most important of which are the matching expression and the replacement text. For example, if the matching expression is:

\1 <body> \2 </body> \3

And the replacement text is:

\1 <body><b>All gone!</b> </body> \3

Then the page contents between the body tags would be replaced with "All gone!" in bold text. The specifiers "\1", "\2" and "\3" are variables: each stores the text that runs from the start of the input, or from the end of the previous match, up to the next matched string (or the end of the input).

Thus, if in our last example the requested Web page read:

<html>
<head><title>My page</title></head>
<body>Howdy!</body>
</html>

The output will be:

<html>
<head><title>My page</title></head>
<body><b>All gone!</b></body>
</html>
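For readers more at home with ordinary regular expressions, the same substitution can be approximated in Python. This is only an analogue, not Proxomitron syntax: the capture groups stand in for the numbered variables, and the names `BODY_RULE` and `replace_body` are ours.

```python
import re

# Rough analogue of the rule "\1 <body> \2 </body> \3".
# Groups 1, 2 and 3 stand in for Proxomitron's \1, \2 and \3.
BODY_RULE = re.compile(r"(.*)<body>(.*)</body>(.*)", re.DOTALL)

def replace_body(html):
    # \g<1> and \g<3> copy the text before and after the body unchanged;
    # group 2 (the old body contents) is simply not used.
    return BODY_RULE.sub(r"\g<1><body><b>All gone!</b></body>\g<3>", html, count=1)

page = """<html>
<head><title>My page</title></head>
<body>Howdy!</body>
</html>"""
print(replace_body(page))
```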

The \1 variable held the text "<html><head><title>My page</title></head>", \2 held "Howdy!" (the text between the body tags), and \3 held "</html>". Actually, this is a very primitive rule because the <body> tag could contain an attribute, as in <body background="mybg.gif">, which would cause the rule to fail. We can solve that by doing this:

\1 <body (*|)> \2 </body> \3

Here the string "(*|)" in the matching expression means that either any sequence of characters (that's the "*") or no characters at all (the "|" means "or", and nothing follows it) can precede the closing ">". You can't use "*" by itself to match any character because the rule will fail - obviously not what we want.
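The difference between the two rules is easy to see with a Python regular expression standing in for each. Again, this is an analogue under our own assumptions rather than Proxomitron's matcher: `[^>]*` plays the part of "(*|)", and the names `STRICT` and `TOLERANT` are made up for the sketch.

```python
import re

# Simple rule: only matches a bare <body> tag.
STRICT = re.compile(r"<body>(.*)</body>", re.DOTALL)
# Tolerant rule: [^>]* allows zero or more characters (attributes,
# for example) before the closing ">".
TOLERANT = re.compile(r"<body\b[^>]*>(.*)</body>", re.DOTALL)

page = '<html><body background="mybg.gif">Howdy!</body></html>'

print(STRICT.search(page))               # None: the attribute defeats the simple rule
print(TOLERANT.search(page).group(1))    # Howdy!
```

The tolerant pattern still matches a plain `<body>` tag, because "zero or more characters" includes none at all.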

So consider a page that contains the dreaded blinking HTML text (to be distinguished from animated GIFs and DHTML tricks that do the same thing). Under Proxomitron, the following rule will find both the opening and closing blink tags (note that a rule will be applied repeatedly to the incoming text):

<\1blink>

Proxomitron's Replacement Text would be:

<\1b>

Thus, "<blink>Isn't this annoying?</blink>" would become "<b>Isn't this annoying?</b>".
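Here is the blink rule as a Python sketch, with an optional "/" captured in group 1 doing the job of \1. The names `BLINK_RULE` and `deblink` are hypothetical, and the case-insensitive flag is our own addition, not something the rule above specifies.

```python
import re

# Group 1 captures the optional "/" so one rule covers <blink> and </blink>.
BLINK_RULE = re.compile(r"<(/?)blink>", re.IGNORECASE)

def deblink(html):
    # re.sub replaces every match in the text, much as the rule
    # is applied repeatedly to the incoming page.
    return BLINK_RULE.sub(r"<\g<1>b>", html)

print(deblink("<blink>Isn't this annoying?</blink>"))
# <b>Isn't this annoying?</b>
```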

Next week, we'll delve further into the depths of Proxomitron. Match your text at gearhead@gibbs.com.


Comments and suggestions to gh@gibbs.com.


