Digital Methods Initiative Tool Archive pertaining to Data Collection

This is an archive of tools developed by the DMI and affiliates over the past years. Not all tools are still available; some are actively maintained, others are not.


4CAT: Capture and Analysis Toolkit
Create datasets from a variety of web platforms - including Reddit, Telegram, 4chan and others - and analyze them.
 
Amazon Related Product Graph
This command-line Python script allows you to enter one or more ASINs and crawl their recommendations up to a user-specified depth.
 
App Tracker explorer
DMI App Tracker Tracker detects predefined fingerprints of known tracking technologies or other software libraries in a set of APK files.
 
Brightbeam
Interactively capture and inspect third-party trackers encountered while browsing.
 
Censorship Explorer
Check whether a URL is censored in a particular country by using proxies located around the world.
 
Deduplicate
Replicates the tags in a tag cloud by their value
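
As a minimal sketch of what replicating tags by their value could look like, the snippet below repeats each tag as many times as its count. The function name and the (tag, value) input format are assumptions made for illustration; the actual tool's input format is not documented here.

```python
def replicate_tags(tag_values):
    """Repeat each tag as many times as its value (hypothetical helper)."""
    replicated = []
    for tag, value in tag_values:
        replicated.extend([tag] * value)
    return replicated


# Assumed input: a tag cloud given as (tag, value) pairs.
cloud = [("privacy", 3), ("tracking", 1), ("consent", 2)]
print(" ".join(replicate_tags(cloud)))
```

A flat list like this can be pasted into tools that expect raw, repeated terms rather than term counts.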
 
Expand Tiny Urls
Expands URLs that have been shortened by services like tinyurl.com or bit.ly. Such shortened URLs are common on social media such as Twitter or Facebook.
 
Github organizations meta-data lookup
Extract the meta-data of organizations on Github
 
Github repositories meta-data lookup
Extract the meta-data of Github repositories
 
Github repositories scraper
Scrape Github for forks of projects
 
Github scraper
Scrape Github for user interactions and user to repository relations
 
Github user meta-data lookup
Extract meta-data about users on Github
 
GithubContributorsScraper
Find out which users contributed source code to Github repositories
 
Google Autocomplete
Retrieves autocomplete suggestions from Google
 
Google Play Store Scraper
Google Play Store Scraper is a simple tool to extract the details of individual apps, collect their related apps, retrieve app permissions, and retrieve a list of apps for a given keyword.
 
Googlescraper (Search Engine Scraper)
Batch queries Google. Query the resonance of a particular term, or a series of terms, in a set of Websites.
 
Harvester
Extract URLs from text, source code or search engine results. Produces a clean list of URLs.
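
The idea of producing a clean URL list from arbitrary text can be sketched as follows. This is an illustration under stated assumptions, not the tool's own implementation: the regular expression, the trailing-punctuation rule, and the name harvest_urls are all hypothetical.

```python
import re

# Assumed pattern: http(s) URLs running up to whitespace, quotes or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def harvest_urls(text):
    """Return unique URLs found in text, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(text):
        # Drop punctuation that commonly trails a URL in running prose.
        url = match.rstrip(".,;:!?)")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sample = 'See <a href="https://example.org/a">a</a>, https://example.org/b. And https://example.org/a again.'
for url in harvest_urls(sample):
    print(url)
```

Deduplicating while preserving order keeps the list readable and makes it easy to feed into a follow-up tool.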
 
Image Scraper
Scrape images from a single page.
 
Internet Archive Wayback Machine Link Ripper
Scrapes links from the Wayback Machine
 
Internet Archive Wayback Machine Network Per Year
Enter a set of URLs; the tool retrieves the archived versions closest to 1 July of a specified year, extracts their links, and outputs a network file.
 
Issue Dramaturg
Enter up to 3 URLs as well as a key word. The Issuedramaturg queries Google for the key word, and shows the Pageranks of the URLs over time. The output is a graph of the Pagerank of the URLs...
 
Issue Geographer
Geo-locates the organizations on an Issue Crawler map, using whois information, and visualizes the organizations' registered locations on a geographical map.
 
Issuecrawler
Enter URLs and the Issue Crawler performs co-link analysis in one, two or three iterations, and outputs a cluster graph. The Issue Crawler also has modules for snowball crawling (up to 3 deg...
 
Link Ripper
Capture all internal links and/or outlinks from a page.
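
One way to separate internal links from outlinks is to compare each link's host with the host of the page itself. The sketch below uses only the Python standard library; the class and function names are hypothetical, and treating "same host" as "internal" is an assumption rather than the tool's documented rule.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def rip_links(html, page_url):
    """Split a page's links into (internal, outlinks) by comparing hosts."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc
    internal, outlinks = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        (internal if parts.netloc == page_host else outlinks).append(absolute)
    return internal, outlinks


page = '<a href="/about">About</a> <a href="https://other.example/x">Other</a>'
print(rip_links(page, "https://example.org/index.html"))
```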
 
Lippmannian Device
The Lippmannian Device is named after Walter Lippmann, and provides a coarse means of showing actor partisanship.
 
Netvizz
Extracts various datasets from Facebook.
 
News Agencies Scraper
Scrape various news agencies for particular keywords and extract titles, images, dates and full text.
 
Rip Sentences
Rip text from a specified page and force line breaks between sentences.
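
Forcing line breaks between sentences can be approximated with a single regular expression. The sketch below is deliberately naive and assumes a sentence ends at ".", "!" or "?" followed by whitespace, so abbreviations such as "e.g." will be split incorrectly; the function name is hypothetical.

```python
import re


def break_sentences(text):
    """Put each sentence on its own line (naive punctuation-based split)."""
    # Replace the whitespace that follows sentence-ending punctuation with a newline.
    return re.sub(r"(?<=[.!?])\s+", "\n", text.strip())


print(break_sentences("Tools age. Some are maintained! Are others?"))
```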
 
Robots.txt Discovery
Display a site's robot exclusion policy.
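
A robot exclusion policy can also be inspected programmatically. The sketch below parses a sample robots.txt with Python's standard urllib.robotparser module and asks whether given URLs may be fetched; the sample policy and the example.org URLs are invented for illustration, and the helper name is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def load_policy(robots_txt):
    """Parse robots.txt content (a string) into a queryable policy object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Invented sample policy: everything under /private/ is off limits to all agents.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

policy = load_policy(SAMPLE_ROBOTS_TXT)
print(policy.can_fetch("*", "https://example.org/private/report.html"))
print(policy.can_fetch("*", "https://example.org/index.html"))
```

To check a live site instead, RobotFileParser also offers set_url() and read(), which fetch the file over the network.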
 
Screenshot generator
Produce screenshots for a list of URLs
 
Search Engine Scraper
 
Table to Net
Extract a network from a table. Set a column for nodes and a column for edges. It deals with multiple items per cell. (by Médialab Sciences-Po)
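
The sketch below illustrates one possible reading of this description, not the tool's actual algorithm: values in the node column become nodes, and two nodes are linked whenever they share a value in the edge column. The ";" separator for multiple items per cell, the column names, and the function name are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def table_to_net(rows, node_col, edge_col, sep=";"):
    """Link nodes that share a value in edge_col; return {(a, b): weight}."""
    members = defaultdict(set)  # edge-column value -> nodes that carry it
    for row in rows:
        nodes = [n.strip() for n in row[node_col].split(sep) if n.strip()]
        links = [e.strip() for e in row[edge_col].split(sep) if e.strip()]
        for link in links:
            members[link].update(nodes)

    edges = defaultdict(int)
    for nodes in members.values():
        # Every pair of nodes sharing this value gets one unit of edge weight.
        for a, b in combinations(sorted(nodes), 2):
            edges[(a, b)] += 1
    return dict(edges)


table = [
    {"author": "Ann; Bob", "keyword": "privacy; apps"},
    {"author": "Cy", "keyword": "privacy"},
]
print(table_to_net(table, node_col="author", edge_col="keyword"))
```

Sorting each pair before counting keeps the network undirected, so (Ann, Bob) and (Bob, Ann) are treated as the same edge.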
 
Text Ripper
Rip all non-HTML content (i.e. text) from a specified page.
 
Timestamp Ripper
Rips and displays a web page's last modification date (using the page's HTTP header). Beware of dynamically generated pages, where the date stamp will be the time of retrieval.
 
Tumblr
A simple co-hashtag and post data tool for Tumblr.
 
Twitter Capture and Analysis Toolset (DMI-TCAT)
Captures tweets and allows for multiple analyses (hashtags, mentions, users, search, ...)
 
Wikipedia Cross-Lingual Image Analysis
Makes the images of all language versions of a Wikipedia article comparable.
 
Wikipedia Edits Scraper and IP Localizer
Scrapes the edit history of a Wikipedia article and geolocates the IP addresses of anonymous edits.
 
Wikipedia Entry Check
This tool checks whether each issue exists as a Wikipedia page, i.e., an article. If the page exists, the tool checks whether the organization is mentioned on it.
 
Wikipedia History Flow Companion
This script allows you to specify a range of Wikipedia revisions for use with the History Flow visualization.
 
Wikipedia TOC Scraper
Scrape the table of contents (TOC) for revisions of a Wikipedia page and explore the results by moving a slider to browse across chronologically ordered TOCs.
 
Wikipedia categories scraper
Scrape Wikipedia for the categories of articles and the categories of related articles in different languages.
 
YouTube Data Tools
A collection of simple tools for extracting data from the YouTube platform via the YouTube API v3.
 
Zeeschuimer
Zeeschuimer is a browser extension that monitors internet traffic while you are browsing a social media site, and collects data about the items you see in a platform's web interface for late...
 
iTunes App Store Scraper
The iTunes App Store Scraper is a simple tool to extract the details of individual apps, collect their related apps, and retrieve a list of apps for a given keyword.
 
 
Topic revision: r16 - 18 Sep 2024, StijnPeeters