Travelogue team journal post #1

Travelogue group members
Sarah – Project Manager
Amy – Technology and Design
Melanie – Outreach and Communication
Evonne – Research
Adam – Technology and Design

The Travelogue project will disrupt and broaden the expatriate narrative by compiling American literary travel narratives and timelines and presenting them through web mapping. Mapping these journeys for display on an interactive website will provide both a visual and theoretical representation of modern literary movements in America, enabling the humanities community to gain a broader understanding of the history and underlying structure of these works. It will also act as a pedagogical tool, allowing students to see narratives and literary movements represented through interactive, visual means, and as a general source of information for a wider public audience.

Thursday, February 20th

The team is off to a successful start, communicating consistently through the Travelogue CUNY Commons group page that Amy created. As a group, we have been discussing the scope of the project and what we would like it to look like.

Sarah created a Google Drive folder for the project. It contains the project-plan spreadsheet and sheets with information on each of the four authors Travelogue will feature. Sarah has been outlining the project scope, noting the details of each author’s “life journey” that Travelogue should highlight.

We have been exploring a diverse list of American authors who have traveled substantially and/or lived abroad. This week we plan to finalize the list of four authors. Zora Neale Hurston (http://chdr.cah.ucf.edu/hurstonarchive/) and Ernest Hemingway (http://www.jfklibrary.org/Research/The-Ernest-Hemingway-Collection.aspx) will most likely be featured. Evonne has been researching the authors, narrowing the list down to those who fit the Travelogue criteria and have the greatest volume of digital content available. She has created a Google Doc with the data collected.

Amy and I have been researching tutorials and guides for the possible platforms and sharing the information and links on the group’s Commons page. We have also researched possible authors to feature, focusing on female authors. I created a Twitter account for Travelogue and shared the account information with the group. During the next collaborative class session, I will ask about best practices for publicly sharing project updates through social media.

Possible platforms the group has discussed:

– CartoDB
– Mapbox
– Google Maps + Google Fusion Tables
– Omeka + Neatline

Adam sketched a logo for Travelogue, which we all agreed was great. He has scanned it and has been actively sharing drafts with the group as he works on the design. Adam has also been researching Omeka + Neatline, along with other platforms and tutorials. The group is looking forward to consulting with Steven Romalewski on which platform would be best and most feasible within the scope of the project. The front runner so far has been Omeka + Neatline. Sarah has also been researching CartoDB, its functionality, and the costs involved in using it.

If you want to contact us, please do. Our project blog is at travelogue.commons.gc.cuny.edu, you can email us at dhtravelogue [at] gmail [dot] com, and you can follow us on Twitter @DhTravelogue.

DH Box: Tackling Project Scope

We have this great Digital Humanities project idea, but what happens between now and launch time?

With an idea like DH Box (a customized Linux OS with preinstalled DH tools and the flexibility to run on a computer as cheap and portable as the Raspberry Pi), there are a number of directions we could take, and we will certainly consider them for further iterations of DH Box beyond the spring term. (This blog currently documents the experiences of a project team enrolled in a graduate course in Digital Humanities Praxis at the Graduate Center, CUNY.)

In order to refine the scope of our tool, we asked ourselves some questions:

  • What approach will we take to educating users about coding and about the infrastructure behind the DH Box software, hardware, and operating system?
  • Which DH tools should we include? See Alan Liu’s curated list for more info on the scope of DH tools out there.
  • What user(s) are we building this for?

The success of our project hinges on our ability to carefully model the scope of the tool by shaping the answers to these questions . . . all by May 12th (public launch date)!

Educational Value

Beyond providing a collection of accessible DH tools, we want DH Box to help bridge knowledge gaps by delivering a strong educational component. We’d like, for instance, undergraduate English students to gain exposure to, and develop proficiency in, Digital Humanities inquiry through the kind of guidance and practical experience DH Box will offer. To that end, we will begin an interactive textbook to provide instruction on the specific tools included in this first iteration of DH Box. We are most inspired by the Learn Code the Hard Way interactive textbook series by Zed Shaw.

Tools

We are gearing this version of DH Box to bring Topic Modeling and Text Analysis to Humanities students!

We began by considering the most popular DH tools out there and quickly realized it made sense to whittle the list down for this phase of the project. We have made our choices based on which software performs best on the Raspberry Pi. We also want to provide DH tools that haven’t yet proliferated to the level of the more popular content management systems such as WordPress.
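
To give a taste of the kind of workflow we have in mind, here is a minimal topic-modeling sketch in Python. It uses scikit-learn and a toy corpus purely for illustration; the tools actually bundled with DH Box may differ.

```python
# A minimal topic-modeling sketch (illustrative only; the tools actually
# bundled with DH Box may differ).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "whales and the sea and the ship",
    "the ship sailed the stormy sea",
    "poems about love and loss",
    "love poems and elegies on loss",
]

# Turn the raw text into a document-term matrix of word counts.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(documents)

# Fit a two-topic LDA model to the corpus.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(dtm)

# Show the top words for each inferred topic.
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[::-1][:4]]
    print(f"Topic {i}: {', '.join(top)}")
```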

Users

Undergraduate Humanities students currently have little familiarity with terms like tokenization, sentiment analysis, etc., or with how these components of text analysis can open expansive modes of textual inquiry. As part of its mission, DH Box will work to make these methods accessible to a broad audience!
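
For readers new to those terms, here is a small example of both steps using NLTK. The library choice and sample sentence are ours, purely for illustration; DH Box’s final toolset may look different.

```python
# A small illustration of tokenization and sentiment analysis with NLTK
# (one possible library among many; shown here only as an example).
import nltk
from nltk.tokenize import word_tokenize
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("punkt", quiet=True)          # tokenizer models
nltk.download("punkt_tab", quiet=True)      # needed by newer NLTK releases
nltk.download("vader_lexicon", quiet=True)  # VADER sentiment lexicon

sentence = "Call me Ishmael. It was a wonderful, terrible voyage."

# Tokenization: splitting running text into words and punctuation.
tokens = word_tokenize(sentence)
print(tokens)

# Sentiment analysis: scoring the emotional polarity of the text.
scores = SentimentIntensityAnalyzer().polarity_scores(sentence)
print(scores)
```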

Stay tuned for exciting updates on implementing the install scripts, using IPython Notebook, and more!

Questions? Comments? Tweet us!

Easy Access to Data for Text Mining

Prospect Workflow

Will 2014 be the year that you take a huge volume of texts and run them through an algorithm to detect their themes? Because significant hurdles to humanists’ ability to analyze large volumes of text have been or are being overcome, this might very well be the year that text mining takes off in the digital humanities. The ruling in the Google Books federal lawsuit that text mining is fair use has removed many concerns about copyright that had been an almost insurmountable barrier to obtaining data. Another sticking point has been the question of where to get the data. Until recently, unless researchers digitized the documents themselves, the options for humanities scholars were mostly JSTOR’s Data for Research, Wikipedia and pre-1923 texts from Google Books and HathiTrust. If you had other ideas, you were out of luck. But within the next few months there will be a broader array of full-text data available from subscription and open access databases.

CrossRef, the organization that manages Digital Object Identifiers (DOIs) for database publishers, has a pilot text mining program, Prospect, that has been in beta since July 2013 and will launch early this year. There is no fee for researchers who already have subscription access to the databases. To use the system, researchers with ORCID identifiers log in to Prospect and receive an API token (alphanumeric string). For access to subscription databases, Prospect displays publishers’ licenses that researchers can sign with a click. After agreeing to the terms, they receive a full-text link. The publisher’s API verifies the token, license, and subscription access and returns full-text data subject to rate limiting (e.g. 1500 requests per hour).
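
To make the flow concrete, here is a hypothetical sketch in Python. The endpoint URL and token header name are placeholders of our own, not the documented Prospect API; only the overall pattern (a token issued via ORCID login, a click-through license, and a rate-limited full-text request) comes from the description above.

```python
# Hypothetical sketch of the Prospect workflow described above. The URL and
# header name below are illustrative placeholders, not the documented API.
import time

import requests

API_TOKEN = "your-alphanumeric-token"  # issued after logging in with an ORCID iD
FULL_TEXT_URL = "https://publisher.example.com/api/fulltext/10.1000/example-doi"
HEADERS = {"CR-TDM-Client-Token": API_TOKEN}  # placeholder header name

response = requests.get(FULL_TEXT_URL, headers=HEADERS)

if response.status_code == 429:
    # Publishers rate-limit requests (e.g., 1,500 per hour); back off and retry.
    time.sleep(int(response.headers.get("Retry-After", 60)))
    response = requests.get(FULL_TEXT_URL, headers=HEADERS)

response.raise_for_status()
xml_document = response.text  # full-text XML, subject to the signed license terms
print(len(xml_document), "characters retrieved")
```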

Herbert Van de Sompel and Martin Klein, information scientists who participated in the Prospect pilot, say “The API is really straightforward and based on common technical approaches; it can be easily integrated in a broader workflow. In our case, we have a work bench that monitors newly published papers, obtains their XML version via the API, extracts all HTTP URIs, and then crawls and archives the referenced content.”
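
The URI-extraction step they describe is easy to picture. Here is a rough sketch of that one piece (our own illustration, not their workbench code):

```python
# Rough sketch of the link-extraction step described above: pull every
# HTTP(S) URI out of a paper's XML so the referenced content can be
# crawled and archived later. (Our illustration, not the actual workbench.)
import re

def extract_http_uris(xml_text: str) -> list[str]:
    """Return the unique HTTP/HTTPS URIs found anywhere in an XML document."""
    pattern = r'https?://[^\s<>"\']+'
    return sorted(set(re.findall(pattern, xml_text)))

sample = '<article><ref href="http://example.org/data.csv"/></article>'
print(extract_http_uris(sample))  # ['http://example.org/data.csv']
```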

The advantage for publishers is that providing API access may stop people from scraping the same URLs that others use to reach individual documents, and publishers won’t have to negotiate permissions with many individual researchers. Although a 2011 study found that publishers approached by scholars with requests for large amounts of data to mine are inclined to agree, it remains to be seen how many publishers will sign up for the optional service and what the license terms will be. Interestingly, the oft-maligned Elsevier is leading the pack, having made its API accessible to researchers during the pilot phase. Springer, Wiley, HighWire, and the American Physical Society are also involved.

Details about accessing the API are on the pilot support site and in this video. CrossRef contacts are Kirsty Meddings, product manager [kmeddings@crossref.org] and Geoffrey Bilder, Director of Strategic Initiatives [gbilder@crossref.org].

Redefining DH

The first semester of the Digital Praxis Seminar was an inspiring invitation into the new age of scholarship. The lecture series set a compelling foundation for engaging the Digital Humanities and opened a portal to possibility. The seminar led me to imagine how I could elevate my own scholarship in the midst of today’s Information Revolution. It challenged me to consider ways to move beyond traditional text-based modes of humanities scholarship and to conceive of new media that give scholarship greater relevance and influence in mainstream society.

At the close of the first semester, I find myself reflecting heavily on the Digital Humanities. At the beginning of the semester, when we were asked to define Digital Humanities, I had trouble coming up with a definition. As I attempt to redefine the field now, one word comes to mind: possibility. The Digital Humanities is all about possibility. It’s about the possibility that comes from collaboration, creativity, problem solving, technology, scholarship, and innovation.

I am very excited to begin working on projects after the break! I look forward to seeing you all. All the best for a happy and healthy new year!

Resources for Film Studies Projects

Since I know there are at least a couple of other film studies people here, and hopefully others are interested as well, below is a non-exhaustive list of possible tools and resources for film analysis. One final note: I think these tools are productive for stimulating both analytical and creative abilities, the latter of which is often lacking in traditional humanities scholarship and pedagogy.

  • Digital Storytelling & Animated GIFs – digital storytelling seems to be growing in undergraduate and K-12 curricula. This could be a great tool for humanities-based coursework, as it allows students to think differently about how stories and films are constructed. Recording/editing tools are now inexpensive and increasingly ubiquitous, and platforms like YouTube can easily publicize a student’s work. Animated GIFs may perform a similar function. Matt pointed me to Jim Groom’s blog, which is very interesting: http://bavatuesdays.com/how-i-stopped-worrying-and-learned-to-love-the-gif/
  • ClipNotes for iPad – this is a very cool app for doing film studies, though at the moment, it is extremely difficult to share one’s work, and use is obviously limited to iPad owners. http://www.clipnotes.org/
  • Visualization – earlier in the semester we looked at Brendan Dawes’ “Cinema Redux” project, which is perhaps the best example, though varying approaches to visualizing films are possible. http://brendandawes.com/projects/cinemaredux
  • Cinemetrics – this is a great tool for doing film measurement analysis. The website contains detailed information, a database, and some written scholarship on the topic. http://www.cinemetrics.lv/
  • Max 6 – we used Max with Phidgets during Bill Turkel’s workshop earlier in the semester. Max contains several free tutorials on working with video clips in the program. There are some very cool possibilities. http://cycling74.com/products/max/

I hope everyone has a nice break!

Race, Surveillance and Technology

The lecture on race, surveillance, and technology captured my attention in a way that no other lecture this semester had. I very much appreciated Ms. Simone Browne’s candid approach to this very difficult subject and the compelling discussions that followed, as well as Zach Blas’s informative workshop on protecting privacy.

The abhorrent history of the branding of slaves, on both eastern and western shores, provides a reference point for understanding corporal punishment and the mass categorization of human beings as “other” as a societal norm. Given this background, and so long as this notion of “otherness” persists, it is concerning that biometric technology used as a surveillance tool could become a great detriment to society, especially since anyone can access this technology. On the upside, the government uses biometric technology as a protection against terrorists at the country’s ports of entry. However, as was noted during the lecture, the private sector can collect data by capturing one’s fingerprints so long as the public acquiesces to finger scans. What could this mean for those who are able to exploit these private treasure troves? Will technologies such as these affect future generations in adverse ways? Should we assume such data collections will always be used to aid the human condition and not to harm it? As the public becomes better informed, will concerns around privacy once again explode onto the national scene? Shouldn’t they?

Researcher Beware!

Genevieve asked a great question Monday night: How does one verify the data reflected in mapping platforms?

In his answer, Steve Romalewski stressed that critical thinking is essential when examining any kind of data. First, look at the metadata to see when the content was updated; then consider where the information came from, who gathered it, and what it includes and, by extension, excludes.

Both Genevieve’s question and Steve’s answer underscore the importance of critical thinking and content transparency in the myriad digital tools we use every day for research. When searching online platforms, there is often a false sense of security that the content will be there and that it will be true, and that if it’s not there, it must not exist at all.

To some extent, it might not exist, online anyway. Take, for example, an online full-text historical newspaper archive. While the platform may advertise a specific title as being available in full text, it doesn’t necessarily tell you that only select issues are included. The hapless researcher, plugging in keywords and getting nowhere, might not be aware of that gap in coverage, and so gets…nothing. Had she known the inclusion dates of that digital archive, she might have realized that while her online search would yield very little, a spin through a physical microfilm reel could prove enormously fruitful, albeit a lot more time consuming.

As we increasingly rely on digital tools for research, sometimes to the exclusion of other resources, we must always be aware of the ways the resources are structured and the content they provide. With that knowledge, we’ll have much more manageable expectations of what can be found, how best to approach it for research, or whether someone is better off consulting another full text portal: the physical book.

DH Thesis

This is a message from my friend Anderson, who was handing out a paper about his thesis during our last class:

Well this is embarrassing…

To everyone who attempted to access my app: I really appreciate it, but Amy Wolfe was kind enough to let me know that I had incorrectly transcribed the URL.

The correct URL for the site is: https://boiling-wildwood-9939.herokuapp.com/ or http://boiling-wildwood-9939.heroku.com

I hope you all will check it out.

Thanks again,
Anderson Evans
e-mail: jevans@gc.cuny.edu
twitter: @Anderson_Evans

The Gentle Introduction Resource

boiling-wildwood-9939.herokuapp.com

The G.I.R. is a Rails-based web app that aims to build a focused collection of crowdsourced academic resources. The app is maintained by Anderson Evans as the core of his thesis for the MALS degree in Digital Humanities at the CUNY Graduate Center.