I’m sure that many of us reading this article have struggled to track actions inside PDF files. PDFs are widely used to present informational content in an easily shared document format. Although PDF originated as a proprietary Adobe format, not much effort has been made to provide a fully integrated, comprehensive client-side tracking solution for SiteCatalyst.


Specifically, one of the main problems arises when fully working JavaScript functionality is needed inside PDF files. As analysts, it is important for us to understand how to track user journeys through PDF documents. Moreover, PDFs may now contain hyperlinks to other pages and even to other PDF files. Every time a PDF-to-PDF link is clicked, the download of the second document is not counted in the “File Downloads” report.

Looking through the lens of an Adobe SiteCatalyst implementation analyst, I have focused this article on those cases where no JavaScript can be written on a page.

One of the greatest advantages in these cases is having a tailored tracking system that lets us track what we first classified as “untraceable” and gave up on.

What can be done in these cases? How can we design a technology that pushes relevant information to the SiteCatalyst servers without JavaScript coming to help?


Now it’s going to get technical, so bear with us for the following steps.

When a PDF links to another PDF via a CTA, no tags can fire natively on the first document. The solution proposed here is based on creating a dummy landing page that hosts XML code. This page contains a Custom XML Tracker that pushes user-defined variable values into the Adobe Marketing Cloud and, at the same time, performs a redirect to the desired URL (the second PDF to be opened). The solution is fully supported and documented by Adobe under the name “Data Insertion API” (see Adobe references for full details). It offers two main methods: HTTP POST and HTTP GET.

In our case, we will focus on the HTTP GET method.

This method submits data to the Data Insertion URL in a query-string format that supports shortened variable names. The values populating the props and eVars in the dummy page’s XML code are taken directly from the query-string parameters on the linked URL. In this way, everything the implementation analyst defines after the “?” is pushed directly into the Omniture servers via the custom XML tracker. Instead of using JavaScript to transmit data to the Adobe servers, server-side data collection happens solely through web browser requests and web server responses.
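
To make the flow concrete, here is a minimal sketch in Python of what the tracker page does with one incoming request, using the parameter names introduced in the next section. It is an illustration under stated assumptions, not Adobe's implementation: the function name, the mapping of `PDFtoPDF` to the short name `c1` and of `ref` to `r`, and the collection address are all hypothetical placeholders. The real Data Insertion URL and variable names should be taken from Adobe's documentation and from your own report suite.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Hypothetical mapping from tracker parameters to shortened variable names.
# "c1" and "r" are assumptions for this sketch; use the variables that your
# report suite reserves for the download and referrer reports.
PARAM_TO_VARIABLE = {"PDFtoPDF": "c1", "ref": "r"}


def handle_tracker_request(request_url, collection_base):
    """Return (collection_url, redirect_target) for one tracker hit."""
    # parse_qs decodes percent-escapes and returns a list per parameter.
    query = parse_qs(urlsplit(request_url).query)

    # Copy each recognised parameter into its analytics variable.
    variables = {
        variable: query[param][0]
        for param, variable in PARAM_TO_VARIABLE.items()
        if param in query
    }
    collection_url = collection_base + "?" + urlencode(variables)

    # The "url" parameter says where the visitor is sent afterwards.
    redirect_target = query.get("url", [None])[0]
    return collection_url, redirect_target
```

A real tracker would then request `collection_url` from the server side and answer the browser with an HTTP redirect to `redirect_target`. The sketch deliberately stops short of any network call so that the mapping logic stays easy to check.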

XML Tracker URL

http://mycustomtracker.php?PDFtoPDF=[Custom defined Value – Displayed in Omniture]&url=[your linked PDF URL – Where to redirect]&ref=[your source PDF filename – Displayed in the referrer report]

-mycustomtracker.php: is the page on your server where the custom XML tracker is implemented. It pushes the query-string parameters directly into user-defined props in Omniture and performs the correct redirect.

-PDFtoPDF: is the custom value defined by you. It identifies the link to the second PDF file and stores its value in a prop. The prop should be set up beforehand for reporting on this new download type.

-url: is the destination URL of the linked PDF.

-ref: is the source PDF. It records the source file under referrers or any other custom report.

For example:

http://mycustomtracker.php?PDFtoPDF=PDF_B_Download&url=www.yoursite.com/PDF_B.pdf&ref=PDF_B from PDF_A


-PDFtoPDF: PDF_B_Download will be the record in your custom-defined download report

-url: www.yoursite.com/PDF_B.pdf is the destination PDF for the CTA in PDF A

-ref: PDF_B from PDF_A will be the source document in a custom referrer report. N-to-N relations can be recorded.
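
Because values such as “PDF_B from PDF_A” contain spaces, and the destination contains slashes, the link placed behind the CTA is safer when it is URL-encoded. The following Python sketch builds such a link; the function name and the tracker location are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def build_tracker_link(tracker_page, download_name, target_pdf, source_label):
    """Build the URL to place behind a CTA inside the source PDF."""
    params = {
        "PDFtoPDF": download_name,  # value shown in the download report
        "url": target_pdf,          # where the tracker redirects to
        "ref": source_label,        # value shown in the referrer report
    }
    # urlencode escapes spaces, slashes and other reserved characters.
    return tracker_page + "?" + urlencode(params)


link = build_tracker_link(
    "http://www.yoursite.com/mycustomtracker.php",  # hypothetical location
    "PDF_B_Download",
    "www.yoursite.com/PDF_B.pdf",
    "PDF_B from PDF_A",
)
print(link)
```

The tracker page then has to decode these values again before using them, which standard query-string parsers do automatically.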

Why this solution is good for you

When multiple elements with no JavaScript were linked together, it was previously impossible to report on these cross-referenced activities in Omniture. The lack of JavaScript snippets inside PDFs, for example, made it impossible to track whether a new file was downloaded from a source PDF, so this new download was never counted in Omniture.

This solution increases the reliability of the data collection process and of the resulting reporting, and it avoids under-reporting activity that was previously missing and untraceable from a non-JavaScript element.

Another clear advantage of the Custom XML Tracker is the ability to fully customise which variables are pushed into Omniture. Any user-defined prop or eVar is connected to the user activity within the document. Variables such as s.campaign can now be recorded as a single instance under their related report. This also has the beneficial side effect of avoiding traffic inflation: no page view or visit is passed through s.pagename, since this variable can be excluded. If the document contains hyperlinks to a website, it can also be viewed as a source of traffic for the linked page.

The solution can also be applied to track email campaigns and e-newsletters, provided they contain a campaign parameter and link to a non-JavaScript document.

The limitations of this solution

You must have control over the servers from which you want to collect data and which host the Custom Tracker. If you use a Content Delivery Network (CDN) to deliver web pages, the server-side data collection described here cannot collect data for those pages. Also, server-side data collection alone cannot provide cross-domain tracking of site visitors.

There is still not much that can be learned about user engagement with the PDF itself. While it is possible to understand how many clicks a CTA receives, other metrics (such as pages per visit or average time spent on the document) still cannot be tracked with this method.

