Off-Ramping Bad Traffic to Get to the Good Traffic

sovrnmarketing // June 23, 2014

Originally published on ClickZ on May 16, 2014.

In order to get more “good” high-quality traffic, you must “off-ramp” the bad traffic. Here are four ways to do that.

Defining “good” traffic should be pretty obvious – and of course publishers want the traffic they generate to be legitimate, just as marketers want to purchase legitimate traffic. Good traffic, after all, comes from real human beings visiting real sites, consuming real content, and taking real actions. But what does it mean when we say traffic is not good? It’s an important question for us in the ad-tech industry.

How well we understand the nuances is important if we are to filter the bad from the good. Less bad traffic means better economics for good traffic. In other words, as we expunge the “bad,” we are left with less traffic overall, but what remains is higher-quality “good” traffic, which means more revenue potential per page.

In order to get to more of the good we have to “off-ramp” the bad.

Off-Ramp 1: Websites That Steal Content From the Rightful Owner.

A gateway drug to outright fraudulent behavior is content theft. Bad actors need content to populate their sites and they won’t or don’t create it themselves. It’s far easier to simply steal it. Remember, fraudsters are criminals. Therefore, the first off-ramp is content which is misappropriated. While this isn’t what most people might think of as “traffic fraud,” it still warrants pointing out as definitely not good.

Eliminating these sorts of websites takes a combination of human judgment, some manual work, and third-party technologies. We need better automated solutions. Finding multiple instances of text is a few Web searches away. Plagiarized images, audio files, and videos are much harder to find, but the problem is still solvable thanks to digital fingerprinting. This work isn’t easy and at times can be tedious – but it’s work that needs doing nonetheless.
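As a rough illustration of what automated text matching can look like, here is a minimal sketch that compares two passages by their overlapping word sequences ("shingles"). The function names, the five-word window, and the scoring method are assumptions made for this example, not a description of any particular vendor's technology.

```python
import re
from typing import Set, Tuple


def shingles(text: str, k: int = 5) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than one window: treat the whole thing as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "Bad actors need content to populate their sites and they steal it."
    suspect = "Bad actors need content to populate their sites, so they copy it."
    print(round(jaccard(original, suspect), 2))
```

A high score only flags a page for review. Deciding who the rightful owner is still takes the human judgment described above.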

Off-Ramp 2: Non-Human Actions.

This is the core of what most people refer to when they talk about fraud in online advertising. Simply put, this is any action that masquerades as human activity in order to trick advertisers into buying or valuing an impression. Non-human actions can include: fake pageviews, clicks, mouse-overs, video plays, putting things in shopping carts, filling out Web forms, scrolling down a page, and so on. Non-human activities can come from automated robots living either in server farms or on malware infected PCs and both are designed with some degree of sophistication to mimic real human behaviors.

I also include the act of “spoofing” a cookie. This is the non-human act of creating or manipulating a desired audience profile in order to dupe the decisions a marketer makes about the value of a particular reader. Cookie profiles are a summary of data collected on an individual reader – and therefore represent a key piece of the value judgment marketers make when deciding where, when, and how much to pay for an ad impression; smart bad guys know this.

Discovering and blocking these non-human types of fraud is an ongoing battle between very creative and technically sophisticated people on either side of the equation. Collaboration among white-hat hackers and the leverage of fingerprinting, automated blacklisting, and other proprietary real-time behavior-sensing techniques are how this is most often combatted.

Off-Ramp 3: Low or No Viewability.

A lot of debate centers on the definition of “viewability.” Basically, it’s the likelihood a message will be seen by a real person. Simply put: If there is little-to-no chance of something being seen, then it’s clearly not “good” traffic. If that low viewability is intentional, then it should be considered stealing.

Measuring viewability is tricky and requires some forensic analysis of the page. It’s not as simple as above or below the fold. Some of this analysis is done post-delivery of the page and message. The reactive nature of this analysis simply means that on the second page load, a clear determination can be made to score the likelihood of a human being able to view it. Several vendors specialize in this sort of analysis but I’d like to see the browser companies (Apple/Safari, Google/Chrome, Microsoft/Explorer, Mozilla/Firefox) step up and put a fork in this issue once and for all. They can “see” top-down as to what gets viewed and this ought to be a signal the browsers make available to the advertising world.
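One small piece of that analysis is plain geometry: how much of the ad's rectangle overlaps the visible part of the browser window. The sketch below shows only that step. The `Rect` type and `visible_fraction` function are hypothetical names for this example, and real measurement also has to account for time on screen, hidden tabs, stacked frames, and the like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    width: float
    height: float


def visible_fraction(ad: Rect, viewport: Rect) -> float:
    """Fraction of the ad's area that falls inside the viewport, from 0.0 to 1.0."""
    if ad.width <= 0 or ad.height <= 0:
        return 0.0
    overlap_w = min(ad.left + ad.width, viewport.left + viewport.width) - max(ad.left, viewport.left)
    overlap_h = min(ad.top + ad.height, viewport.top + viewport.height) - max(ad.top, viewport.top)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0  # the ad is entirely off screen
    return (overlap_w * overlap_h) / (ad.width * ad.height)


if __name__ == "__main__":
    viewport = Rect(left=0, top=0, width=1000, height=800)
    half_cut_off = Rect(left=100, top=700, width=300, height=200)
    print(visible_fraction(half_cut_off, viewport))
```

Note that a tiny 1x1 ad slot sitting inside the viewport scores 1.0 here, so a check like this would need to be paired with a minimum-size rule to catch intentionally hidden placements.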

Off-Ramp 4: Obfuscating the Page URL.

A key signal in placing any ad is where or in what context the ad is going to show up. With the rise of audience buying, this might be less important to some advertisers but I would submit that ALL advertisers deserve to know where their ads are landing. When a page URL is either unknown, or worse, intentionally changed or obfuscated, it at a minimum breaches trust and can go as far as being outright fraud. Simply put: A bad actor knows that a known URL is a measure of trust – so why not make sure all your bad URLs are “laundered” so they appear to be good?

Knowing and passing the URL to an advertiser or their agent ought to be a requirement in online advertising. There is no fundamental technical reason that the URL gets dropped from the information transfer. Sometimes it’s hard to determine or a well-meaning ad server technology messes it up inadvertently, but those aren’t valid excuses, just excuses.
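To make the idea concrete, here is a minimal sketch of one sanity check a buyer or exchange might run: compare the host of the page URL declared with the ad request against the host actually observed at delivery. The function names, the labels, and the assumption that an independently observed URL is available are all illustrative, and the exact-host comparison is deliberately naive.

```python
from typing import Optional
from urllib.parse import urlsplit


def normalized_host(url: str) -> Optional[str]:
    """Lower-cased hostname without a leading 'www.', or None if the URL has no host."""
    host = urlsplit(url).hostname  # None when the URL lacks a scheme such as https://
    if not host:
        return None
    host = host.lower().rstrip(".")
    return host[4:] if host.startswith("www.") else host


def classify_declared_url(declared_url: Optional[str], observed_url: Optional[str]) -> str:
    """Label a declared page URL as 'missing', 'unverified', 'match', or 'mismatch'."""
    declared = normalized_host(declared_url) if declared_url else None
    observed = normalized_host(observed_url) if observed_url else None
    if declared is None:
        return "missing"      # the URL was dropped or is unusable
    if observed is None:
        return "unverified"   # nothing independent to compare against
    return "match" if declared == observed else "mismatch"


if __name__ == "__main__":
    print(classify_declared_url("https://www.example.com/news", "https://example.com/news?id=7"))
    print(classify_declared_url("https://example.com/news", "https://scraper.example.net/copy"))
```

A "mismatch" is not proof of laundering, since a well-meaning ad server can mangle the URL too, but it is exactly the kind of case that deserves a closer look.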

Each “off-ramp” above is clearly addressable. There is no reason the industry at large can’t apply a set of best practices to each category and stamp out the bad traffic in favor of the good. If we all do this, then the results benefit everyone. Good publishers get better economics. Marketers get higher quality and better performing investments. And bad guys get the shaft.
