We’re excited to announce our inaugural study on how people discover and engage with content on the web!

In the first quarter of 2011, for the first time ever, we decided to analyze our expansive data and share it with the industry to help inform conversations around content discovery. And now we’re ready to share it with everyone.

To do this, we looked at traffic patterns from 100 million sessions across more than 100 premium publishers currently using our platform to see how readers access content, where they find it and how they engage with it. We’ve compiled this data into our inaugural report, and we hope to use it as a benchmark for future quarterly trend analyses.

Let’s set the stage: Methodology

In order to set a baseline for future reports, we looked at the referral data into publisher articles to see which sources are currently driving the most traffic, as well as the level of engagement and quality of the traffic coming from these sources. For this study, we pulled a sample data set of 100 million sessions across more than 100 of Outbrain’s top publishers. A session is defined as a series of page views within a publisher site with no more than 30 minutes between one page view and the next.
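To make the session definition concrete, here is a minimal Python sketch of how page views could be grouped into sessions using the 30-minute rule. It is an illustration only, not the pipeline used for this report; the `(visitor_id, publisher, timestamp)` tuple layout, the `sessionize` function name and the sample data are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Threshold taken from the session definition above.
SESSION_GAP = timedelta(minutes=30)


def sessionize(page_views):
    """Group (visitor_id, publisher, timestamp) page views into sessions.

    A new session starts whenever more than SESSION_GAP has passed since the
    same visitor's previous page view on the same publisher site.
    The tuple layout is hypothetical, chosen only for this sketch.
    """
    sessions = []
    open_sessions = {}  # (visitor_id, publisher) -> timestamps of the open session
    for visitor_id, publisher, ts in sorted(page_views, key=lambda pv: pv[2]):
        key = (visitor_id, publisher)
        timestamps = open_sessions.get(key)
        if timestamps is not None and ts - timestamps[-1] <= SESSION_GAP:
            timestamps.append(ts)  # still within 30 minutes: same session
        else:
            timestamps = [ts]  # first view, or gap too long: new session
            open_sessions[key] = timestamps
            sessions.append((visitor_id, publisher, timestamps))
    return sessions


# Made-up example: three page views by one visitor on one site.
t0 = datetime(2011, 1, 15, 9, 0)
views = [
    ("v1", "example-news", t0),
    ("v1", "example-news", t0 + timedelta(minutes=10)),
    ("v1", "example-news", t0 + timedelta(minutes=50)),  # 40-minute gap
]
sessions = sessionize(views)
print([len(timestamps) for _, _, timestamps in sessions])
```

In this example the first two page views fall into one session and the third, arriving 40 minutes after the second, opens a new one.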

Outbrain tracks traffic to content pages (articles and videos). We saw that approximately 33% of the overall sessions into such pages start from an external site. The remaining 67% of content sessions begin internally: from type-in traffic, bookmarks, clicks from the publisher’s homepage and other in-site links, or from unknown sources. For the purposes of this study, we evaluate only the third of sessions that begin offsite, in order to focus on how people discover content from outside sources.

Key Findings

  • While search still reigns supreme in terms of directing traffic to content pages (41% of external referrers), social is gaining share at 11%.
  • Of the six content verticals examined, stories in the news, entertainment and lifestyle categories are the most likely to receive traffic from social sources.
  • Traffic coming from social media sources has the highest tendency to bounce.
  • Readers who go from one content site to another (e.g. USA Today to The Daily Beast) are most likely to be engaged in what they’re reading, presumably because they are already in content consumption mode.
  • Facebook delivers a more diverse audience than Twitter.

Top Traffic Sources

According to our data, the following sources were the top referrers of traffic during Q1, by number of sessions:*

Top 20 Sources

For the purposes of our study, we are focusing on known external referral traffic to content pages, which accounts for 33% of our sample set (direct, in-site and unknown traffic sources account for the other 67%).

Here is a breakdown of referring sources in terms of the type of referrer represented:

Excluding direct and in-site traffic, an adjusted breakdown of traffic share is illustrated by the following pie chart:

Currently, search engines (including Google, AOL Search, Bing, Yahoo and Ask) send the largest slice of referral traffic to content, at 41%. Links from publisher sites make up 31% of referral traffic to content pages (this includes manual partner syndication, linkswaps, Outbrain recommendations of content to content, etc.), portal homepages (AOL.com, Yahoo.com, MSN.com) account for 17% of traffic, and finally, social media sites (Facebook, Twitter, StumbleUpon, Fark.com, reddit, Digg) send 11% of traffic to content pages.
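Because these percentages are shares of external referral traffic only, it can help to translate them into shares of all content sessions. The short Python sketch below does that back-of-the-envelope conversion, using the approximate 33% external share and the 41% search share from the key findings. The variable names are ours, and the results are rough estimates derived from the rounded figures in this post, not numbers taken from the report itself.

```python
# Approximate percent of all content sessions that start from an external site.
EXTERNAL_SHARE = 33

# Percent of external referral traffic by referrer type, as quoted in this post.
referral_shares = {
    "search": 41,
    "publisher sites": 31,
    "portal homepages": 17,
    "social media": 11,
}

# Estimated percent of ALL content sessions attributable to each referrer type.
share_of_all_sessions = {
    source: round(EXTERNAL_SHARE * pct / 100, 1)
    for source, pct in referral_shares.items()
}

for source, pct in share_of_all_sessions.items():
    print(f"{source}: ~{pct}% of all content sessions")
```

By this estimate, search accounts for roughly 13.5% of all content sessions in the sample, and social media for roughly 3.6%.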

In subsequent quarters, we plan to pay special attention to trends in this category in order to evaluate how people’s online behavior may be altering content consumption patterns.

What type of content do people share?

It’s no secret that people are spending an ever-increasing amount of time on social sites like Facebook and Twitter. One of the by-products of this shift is that these same people are now relying on their networks of friends and peers to alert them to interesting news and content. We thought it was worth looking at how this breaks down among the different content verticals. News stories, at 42%, were the most likely to receive traffic from social sites, followed by entertainment stories at 30% and lifestyle stories at 13%.

 

Reader Engagement By Category

In addition to looking at which sources deliver the most traffic, we also analyzed how those sources stack up against each other in terms of delivering the most engaged audiences, as measured by metrics such as average page views per session and bounce rates.**

 

Content sites have the lowest bounce rates, presumably because they are targeting an audience that is already engaged and in content consumption mode. Traffic coming from social media sources, on the other hand, has the highest tendency to bounce.

 

Delving into the topic of engagement a bit further, we looked at which sources are most likely to refer a hyper-engaged reader, which we defined as a reader who views five or more pages per session.

 

Readers referred from publisher sites (content to content) are likely to consume more pages than those referred from other sources. Readers referred from search queries consume slightly more pages than average, though this number is skewed by Yahoo, whose sessions are twice as likely to be hyper-engaged as those from the average referrer.
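For readers who want to compute the same engagement metrics on their own traffic, here is a minimal Python sketch. It follows the definitions used in this post (average page views per session, bounce rate as the share of one-page sessions, and hyper-engaged sessions as those with five or more pages). The `engagement_metrics` function name and the sample numbers are made up for illustration and are not taken from the report.

```python
def engagement_metrics(pages_per_session, hyper_threshold=5):
    """Summarize a list holding the number of page views in each session."""
    total = len(pages_per_session)
    if total == 0:
        raise ValueError("need at least one session")
    # A bounce is a session that lasted only one page view.
    bounces = sum(1 for n in pages_per_session if n == 1)
    # A hyper-engaged session has five or more page views.
    hyper = sum(1 for n in pages_per_session if n >= hyper_threshold)
    return {
        "avg_page_views": sum(pages_per_session) / total,
        "bounce_rate": bounces / total,
        "hyper_engaged_share": hyper / total,
    }


# Made-up page-view counts for eight sessions from a single referrer.
sample = [1, 1, 2, 3, 5, 8, 1, 4]
metrics = engagement_metrics(sample)
print(metrics)
```

For this sample, three of the eight sessions are bounces and two are hyper-engaged. Running the function once per referrer type would give the kind of side-by-side comparison described above.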

 

Comparing Facebook and Twitter as Referral Sources

Given the popularity of both Facebook and Twitter, we thought it was worth comparing their relative traffic quality to see what differences exist. Surprisingly, the two sites drive similarly engaged audiences in terms of page views per session, bounce rates and hyper-engaged reader sessions. The one key difference is in their relative reach, which we define as the number of unique visitors per 1,000 sessions. Specifically, we found about 72% of sessions originating from Facebook were from a unique visitor, versus only 52% in the case of Twitter, suggesting that Twitter’s audience is more likely to be made up of repeat visitors.
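As a rough illustration of the reach metric, the Python sketch below counts unique visitors per 1,000 sessions from a list with one visitor ID per session. We are assuming here that reach is simply the number of distinct visitors divided by the number of sessions, scaled to 1,000; the function name and the sample IDs are hypothetical.

```python
def reach_per_thousand(session_visitor_ids):
    """Unique visitors per 1,000 sessions, given one visitor ID per session."""
    if not session_visitor_ids:
        raise ValueError("need at least one session")
    unique_visitors = len(set(session_visitor_ids))
    return 1000 * unique_visitors / len(session_visitor_ids)


# Made-up example: four sessions, one of them from a returning visitor.
sample_sessions = ["a", "b", "a", "c"]
reach = reach_per_thousand(sample_sessions)
print(reach)
```

Under this reading, the figures above would correspond to roughly 720 unique visitors per 1,000 sessions for Facebook and roughly 520 for Twitter.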

 

Summary

It’s an exciting time in the content discovery space as larger trends in web usage play out in the ways readers access and engage with content. Though traditional methods like search still reign supreme, we’re keeping a close eye on new trends such as social sharing and the increasing openness of content sites to link freely to one another. From a publisher standpoint, it’s particularly important to recognize and understand these shifts in order to identify the best way to get your content in front of readers. In future reports, we hope to highlight changing trends and track them against our baseline data.

 

*Results may be skewed towards news and entertainment sites, as they constitute more than 50% of the publishers working with Outbrain.

**Average page views per session is defined as the average number of pages visited by a user during his or her session. Bounce rate is defined as the percentage of sessions that lasted only one page view.


>>>> Download a high-res PDF of the report: Content Discovery and Engagement Report, Q1 2011

At Outbrain we strive to utilize open source projects wherever possible — we’d like to thank The Apache Software Foundation and Cloudera for their brilliant work on Hadoop and Hive and for helping us handle our “big data” needs, without which our Q1 report might not have surfaced until sometime in Q3.

 

Have questions? We’d love to hear ’em!