Tracker Tracker


DMI App Tracker Tracker is a tool that detects predefined fingerprints of known web tracking technologies in a set of URLs.
 

Instructions

DMI App Tracker Tracker is built on top of Ghostery, a privacy browser extension for the web: https://www.ghostery.com/.

Instructions:
  1. First, create a seed list of one or more URLs pointing to web domains, to individual pages on those domains, or to a combination of both. Example URLs: domain: https://www.huffingtonpost.com/; individual page: https://www.huffingtonpost.com/nate-hanson/how-to-stop-facebook-from_b_8160400.html.
  2. Second, input the seed list of URLs (one per line) and run the tool to detect predefined fingerprints of tracking technologies in each page’s source code. Don’t forget to give your result a descriptive name (e.g., including your name, a date, and a keyword) so that you can retrieve it later. A process log is shown while the tool is running.
  3. Third, the tool analyses the source code of each page in the seed list, matching it against an extensive library of more than 2,000 unique fingerprints of known web tracking technologies. Please note: this tool does not click any buttons, does not accept cookies, and cannot get past logins, cookie walls, or paywalls. Instead, it simulates a real browser using PhantomJS, a scriptable headless browser, which might cause certain content not to load properly (in most cases, though, the results should be similar to those from a regular browser).
  4. Finally, after successful completion, results can be downloaded via the ‘Output’ tab. The tool outputs the following standard file formats: tabular (.HTML, .CSV) and network (.GEXF). These outputs include the eight native tracking technology categories defined by Ghostery (https://ghostery.zendesk.com/hc/en-us/articles/115000740394-What-are-the-new-tracker-categories-). Please note: this tool stores previous jobs under the ‘Past Jobs’ tab, under the job name provided by the researcher.
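
The matching described in step 3 can be illustrated with a minimal sketch. It assumes that a fingerprint is a regular expression searched for in the page source. The two patterns below are hypothetical examples written for illustration; they are not taken from Ghostery's library, and the tool's actual matching logic may differ.

```python
import re
from typing import List

# Hypothetical fingerprints for illustration only; the tool itself relies on
# Ghostery's library of more than 2,000 fingerprints.
FINGERPRINTS = {
    "Google Analytics": re.compile(r"google-analytics\.com/(ga|analytics)\.js"),
    "Facebook Connect": re.compile(r"connect\.facebook\.net"),
}


def detect_trackers(page_source: str) -> List[str]:
    """Return the sorted names of all fingerprints found in the page source."""
    return sorted(
        name
        for name, pattern in FINGERPRINTS.items()
        if pattern.search(page_source)
    )


sample_html = '<script src="https://www.google-analytics.com/analytics.js"></script>'
print(detect_trackers(sample_html))  # ['Google Analytics']
```

Because a sketch like this only inspects the source it is given, it shares the limitation noted above: trackers that load only after a click, a login, or cookie consent will not be found.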

Sample project

Analyze a set of URLs to see whether they contain widgets, analytics, or other types of trackers:

1. Insert a list of URLs (max 100)

2. Choose whether to retrieve trackers only for the inserted URLs or also for deeper pages. Including deeper pages is useful, for instance, when detecting social plugins on news sites, which are commonly present on deeper pages but not on the front page. Make sure to specify the maximum number of subpages to include (e.g. 2), because including subpages slows the tool down.

3. Output is a CSV or a Gephi (.gexf) file

4. Optional: visualize the connections between the websites and the trackers in Gephi

But what if my source set has 1,000 URLs and I want to visualize them all in Gephi?

We like medium-sized data too! The tool tends to slow down or break down when given a large number of URLs. We recommend inputting 100 URLs at a time and then combining the resulting Gephi files.
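
As a rough sketch of this workflow, a long seed list can be cleaned and split into batches of at most 100 URLs before each batch is pasted into the tool. The function names `read_seed_list` and `make_batches` are hypothetical helpers written for this example; they are not part of the tool.

```python
from typing import List


def read_seed_list(text: str) -> List[str]:
    """Split pasted text into unique, non-empty URLs, keeping the original order."""
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def make_batches(urls: List[str], size: int = 100) -> List[List[str]]:
    """Split the URL list into consecutive batches of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# Example: 1,000 placeholder URLs become 10 batches of 100.
seed_text = "\n".join("https://example.org/page-%d" % i for i in range(1000))
url_batches = make_batches(read_seed_list(seed_text))
print(len(url_batches), len(url_batches[0]))  # 10 100
```

Each batch can then be run as a separate job, and the resulting .gexf files combined afterwards.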

How to combine the Gephi files?

1. Open a first .gexf file

2. Go to File > Open, select a second .gexf file, and click Open

3. In the Import report dialogue, select Append Graph. This merges the two .gexf files (graphs)

4. Repeat steps 2-3 to merge all your .gexf files into a single graph.
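
For many files, the same merge could also be scripted. The following is a minimal sketch using only Python's standard library, not a description of what Gephi does internally. It assumes that each file contains a single graph with a nodes and an edges element, and that the same website or tracker has the same node id in every file; both assumptions should be checked against the tool's actual output. Attribute definitions are left untouched.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def find_child(parent, name):
    """Return the first direct child with the given local tag name, or None."""
    for child in parent:
        if local_name(child.tag) == name:
            return child
    return None


def merge_gexf(first_xml: str, second_xml: str) -> str:
    """Append the nodes and edges of the second graph to the first one."""
    root = ET.fromstring(first_xml)
    graph = find_child(root, "graph")
    other_graph = find_child(ET.fromstring(second_xml), "graph")
    nodes, edges = find_child(graph, "nodes"), find_child(graph, "edges")

    # Nodes are merged by id; edges by their (source, target) pair.
    seen_nodes = {node.get("id") for node in nodes}
    seen_edges = {(edge.get("source"), edge.get("target")) for edge in edges}

    for node in find_child(other_graph, "nodes"):
        if node.get("id") not in seen_nodes:
            seen_nodes.add(node.get("id"))
            nodes.append(node)
    for edge in find_child(other_graph, "edges"):
        key = (edge.get("source"), edge.get("target"))
        if key not in seen_edges:
            seen_edges.add(key)
            edges.append(edge)

    # Renumber all edges so that ids stay unique after merging.
    for index, edge in enumerate(edges):
        edge.set("id", str(index))
    return ET.tostring(root, encoding="unicode")


# Two tiny hand-written graphs that share one tracker node ("GA").
first = ('<gexf><graph><nodes><node id="a.com" label="a.com"/>'
         '<node id="GA" label="GA"/></nodes>'
         '<edges><edge id="0" source="a.com" target="GA"/></edges></graph></gexf>')
second = ('<gexf><graph><nodes><node id="b.com" label="b.com"/>'
          '<node id="GA" label="GA"/></nodes>'
          '<edges><edge id="0" source="b.com" target="GA"/></edges></graph></gexf>')
merged = merge_gexf(first, second)
```

To merge more than two files, the result can be passed back in as the first argument together with the next file's contents.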

Other projects using this tool

Workshop Tracking the Trackers
Track the Trackers Workshop by Anne Helmond and Alexei Miagkov (Ghostery) for the Digital Methods Summerschool 2013 by Anne Helmond and Carolin Gerlitz for the Digi...

Topic revision: r6 - 17 Nov 2014, ErikBorra