David Moats
Ben Koeck
Michael Van Der Haagen
Vera Franz
Kevin Deluca
Tommaso Elli
Adrian Bertoli
This topic and data set were dealt with the previous summer in <a href="https://wiki.digitalmethods.net/Dmi/DetectingTheSocials" target="out">Detecting the Socials</a>, but the focus there was on extracting forms of sociality through social media rather than on the substantive debate itself. What the previous project did find was that before Snowden the privacy query was focused on celebrity and personal topics, but that it quickly became highly politicised and specific following the leaks. It was particularly interesting to look at this topic a year later because of the possibility of anniversary commemorations and also Glenn Greenwald's book tour.
1. Characterising the debate: Jun 2013 and now
How has the debate developed one year on? Is it driven more by news or by professional campaigners? Or is it, in contrast, an individual, personal issue? There were also sub-issues such as whistleblower advocacy, legal reform in the US, and other ways of scoring political points off the controversy. Which was most prevalent then, and which is most prevalent now?
2. Who are the debate leaders (institutions or individuals)?
Along the same lines, which actors, institutions or sources are driving the debate? Is the UK, compared with the rest of Europe, merely reproducing the US sources?
3. Geolocate the debate: US versus UK?
Finally, our topic expert, Vera Franz, was particularly interested in whether the topic of privacy had remained international or had focused more on US campaigns aimed at changing policy. Others were curious whether this was a regional issue: city versus rural, or Red State versus Blue State.
For this project we used the TCAT data set Privacy, which queries for privacy, surveillance and facebookprivacy. There was another data set which scraped for a different set of terms around Prism, NSA and Snowden, but this was not consistent for both periods and also would not contain the more personal, less politicised elements of the debate.
We divided the data set into three weeks for each year: before, during and after the Snowden leaks in 2013, and before, during and after the anniversary in 2014.
Week 1: 31.05.2013 - 05.06.2013
Week 2: 06.06.2013 - 12.06.2013
Week 3: 13.06.2013 - 20.06.2013
Week 1: 31.05.2014 - 05.06.2014
Week 2: 06.06.2014 - 12.06.2014
Week 3: 13.06.2014 - 20.06.2014
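The week windows above can be expressed as a small lookup. This is only a minimal sketch of the bucketing step, assuming each tweet carries a calendar date; the names `WEEKS` and `week_of` are hypothetical and are not part of TCAT.

```python
from datetime import date

# Inclusive week boundaries copied from the lists above (day.month.year).
WEEKS = {
    2013: [
        (date(2013, 5, 31), date(2013, 6, 5)),
        (date(2013, 6, 6), date(2013, 6, 12)),
        (date(2013, 6, 13), date(2013, 6, 20)),
    ],
    2014: [
        (date(2014, 5, 31), date(2014, 6, 5)),
        (date(2014, 6, 6), date(2014, 6, 12)),
        (date(2014, 6, 13), date(2014, 6, 20)),
    ],
}


def week_of(day):
    """Return (year, week_number) for a date, or None if it falls outside every window."""
    for number, (start, end) in enumerate(WEEKS.get(day.year, []), start=1):
        if start <= day <= end:
            return (day.year, number)
    return None
```

For example, `week_of(date(2013, 6, 6))` falls in the second 2013 window, while a date in July 2013 returns `None` and would be dropped from the comparison.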
We established subgroups based on specific digital objects, users and hashtags, which we investigated using networks and qualitative content analysis. We used co-hashtag networks to profile the debate and mention networks to profile the debate actors.
Simultaneously, we investigated the possibilities for locating the debate using different types of geographic traces: geocoded coordinates, time zones, language and UTC offset.
From the co-hashtag networks, we could clearly discern a shift in the framing and agenda setting of the issue as #privacy became attached to more political, as opposed to personal, hashtags. In the second and third weeks there was the emergence of party political hashtags such as #tcot (top conservatives on twitter) and #lcot (top liberals), along with other non-privacy-specific causes like #ows.
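To make the co-hashtag idea concrete: two hashtags are linked whenever they appear in the same tweet, and the link weight counts how often that happens. The following is a minimal sketch using only the Python standard library, not the TCAT export that was used in the project; the function name `cohashtag_edges` and the sample tweets are invented for illustration.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def cohashtag_edges(tweets):
    """Count how often each unordered pair of hashtags co-occurs in a tweet."""
    edges = Counter()
    for text in tweets:
        # Lower-case and de-duplicate so #NSA and #nsa count as one node.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        for pair in combinations(tags, 2):
            edges[pair] += 1
    return edges


# Invented sample tweets, for illustration only.
sample = [
    "Worried about #privacy and #surveillance after the #NSA story",
    "#privacy #tcot #nsa",
    "New #privacy settings on facebook",
]
edges = cohashtag_edges(sample)
```

The resulting weighted edge list can be loaded into a network tool; comparing the neighbours of `privacy` week by week is one way to see the shift from personal to political hashtags.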
2013 Data
Week 1: eff, privacyint, txitua, cyberrights, bettybrowser, privacycamp, canaryorg, aclu, ioerror, privacysurgeon, mashable, liberationtech. Not included because not directly relevant: sirbasstoven, paulbernaluk, copyright_italy
Week 2: guardian, ggreenwald, algore, nytimes, senrandpaul, guardianus, kimdotcom, anonyops, eff, barackobama, ioerror, trevortimm, Thomas_drake, normative, democracynow. Not included: kimjongnumberun, damienfahey, stephenathome
Week 3: thomasdrake, ggreenwald, eff, privacyint, aclu, ioerror, csoghoian, guardian, washingtonpost, liberationtech, trevortimm, barackobama, youranonnews
2014 Data
Week 1: Very activist and campaign oriented; campaign accounts clearly dominate the discussion (fightfortheftr, youowntheweb, eff, youranonnews). News coverage is geared towards the anniversary: the Guardian, tech sites such as wired, ap, guardiantech.
Week 2: Less campaign based, more activist based (privacyint, privatelocknet, doctorow, arusbridger). The Guardian is mentioned more than in week 1. Discussion moves into more specific topics. Other important actors are fightfortheftr, flemingjude, eff and paulnemitz.
Week 3: The campaigning and activist hype is over. We are left with experts and clear subconversations, with a Canadian (mgeist) / European (casparbowden, tom_watson, paulnemitz) / US (oaklandprivacy) divide. There is a transpartisan public policy debate between liberals and conservatives in the US.
Before Snowden, the issue space was very activist based, whereas in weeks 2 and 3 there is enormous noise, particularly coming from news actors. There were also several leaders in the discussion in week 2, as in week 3. Within the trending topic, we found that spam or unrelated actors were using the hashtag to promote other products or services.
In 2014 there was a rise in discussion before the actual anniversary, but the peak quickly disappears, after which distinct discussions form and can be discerned more clearly.
In terms of the presumed noise from spam bots and promotional sources, we noticed a marked discrepancy between the rate of tweeting and the number of mentions received. At one end of the scale were actors who only received mentions but did not participate (public figures like President Obama), and at the other end were mainly bots who tweeted frequently but were ignored by most other users. In future, a tweet-to-mention ratio could be used to filter out some of the less relevant accounts.
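One possible form of such a ratio is tweets sent divided by mentions received. The sketch below is an assumption about how it might be computed, not something that was implemented in the project; the function name `tweet_mention_ratios` and the sample accounts (apart from barackobama) are hypothetical.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def tweet_mention_ratios(tweets):
    """Map each user to (tweets_sent, mentions_received, sent / received).

    `tweets` is an iterable of (author, text) pairs. Users who are never
    mentioned get an infinite ratio; users who never tweet get 0.0.
    """
    sent = Counter()
    received = Counter()
    for author, text in tweets:
        sent[author.lower()] += 1
        for name in MENTION.findall(text):
            received[name.lower()] += 1
    ratios = {}
    for user in set(sent) | set(received):
        s, r = sent[user], received[user]
        ratios[user] = (s, r, s / r if r else float("inf"))
    return ratios


# Invented sample data, for illustration only.
sample = [
    ("promo_bot", "Cheap VPN deals #privacy"),
    ("promo_bot", "Best VPN offer #privacy"),
    ("promo_bot", "Last chance #privacy"),
    ("alice", "@barackobama must act on #privacy"),
    ("bob", "@barackobama @alice agreed"),
]
ratios = tweet_mention_ratios(sample)
```

Accounts with a very high ratio (frequent tweeting, no mentions) are candidates for filtering, while a ratio of zero marks accounts that are talked about but do not take part. Any cut-off would need to be checked by hand, since ordinary low-profile users also tweet without being mentioned.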
EU: tom_watson, greenjennyjones, robevansgdn, raycorrigan, paulbernaluk, casparbowden, julianhuppert, jamesbruk, gurchetangrewal, cyberseckent, arusbridger, glynmoody, ianbrownoii, cnil, netizenrights (Italy subcluster: roccopanettait, annamasera, montecitorio)
Canada: mgeist, veritas_truth_, lawscribes, heatherrenwick, minpetermckay, citizenlab, josh_wingrove, privacylawyer, canadacjfe, jennbarrigar, chuddles11, patondabak, pdmcleod, caparsons, cippic, mcinnescooper
USA: repzoelofgren, repthomasmassie, pmocek, kevinbankston, senrandpaul, mostrolenk, 4yourfreedom, elizabeth_joh, joshgerstein, seattleprivacy, oaklandprivacy, astepanovich, a_greenberg
Australia: Asher_Wolf
We then used this subset of actors to show the change in their top hashtag use before and after Snowden, with the RAW bubble diagrams:
One other key event in the 2014 weeks was the publication of Glenn Greenwald's book No Place to Hide. We decided to test the viability of using geocoding data (primarily generated by smartphones) to locate discussions of Greenwald on his book tour. As we can see from the map below, the tweets to some extent follow the path of his book tour down the east coast of the United States, but there is also significant international commentary as the tour goes on. This is because the book tour is not just a localised event but also includes TV appearances, so it is never fully grounded. Both Anonymous and WikiLeaks founder Julian Assange engaged in commentary on the book.
We had some skepticism towards the way place is mapped through Twitter, so as an experiment we also decided to compare geocoding to the self-reported place listed on user profiles. Self-reported places were assigned a geocoordinate and connected to geocoded tweets, both of which were placed on a map using Gephi. The results are striking in that many users who are in the US or South East Asia list their location as Alaska; this is possibly because Alaska would appear first on a drop-down menu.
Finally, we attempted to locate the debate using another bit of Twitter data: time zones (relative to UTC). This could be seen as more trustworthy than self-reported location and more ubiquitous than geocoding. With a combination of language data and time zones, most countries, and regions within larger countries, could be identified. In the map below we used a user-hashtag bipartite network of the 2013 dataset and plotted the users by time zone, allowing the hashtags they used to gather around the relevant time zones. Languages other than English were colour coded.
As one can see, the debate centres around the east coast of the US and the UK. Surveillance, Privacy and Prism are located in the Atlantic between the two, while NSA is firmly located in the US debate. The US debate also has tcot and teaparty, while #bigbrother is more linked to Britain, home of Orwell.
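The intuition behind letting hashtags "gather" around time zones can be approximated without a network layout: place each hashtag at the average UTC offset of the users who used it. This is a deliberate simplification and not the layout used for the map; the function name `hashtag_offsets`, the use of offsets in hours, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def hashtag_offsets(usage):
    """Return {hashtag: mean UTC offset in hours} from (offset, hashtag) pairs.

    Each pair stands for one use of a hashtag by a user whose profile
    reports that UTC offset.
    """
    totals = defaultdict(lambda: [0.0, 0])  # hashtag -> [sum of offsets, count]
    for offset, tag in usage:
        entry = totals[tag.lower()]
        entry[0] += offset
        entry[1] += 1
    return {tag: total / count for tag, (total, count) in totals.items()}


# Invented sample: -5 stands for the US east coast, 0 for the UK.
sample = [
    (-5, "nsa"), (-5, "nsa"),
    (-5, "privacy"), (0, "privacy"),
    (0, "bigbrother"),
    (-5, "tcot"),
]
positions = hashtag_offsets(sample)
```

A hashtag used equally on both sides ends up midway between the two offsets, while one used only in a single time zone stays there.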