Category Archives: Weekly Project Reports 2014

Weekly project reports from the various student DH projects going on during the Spring 2014 semester.

DH Box Takes Off

Cross-posted from the DH Box Blog: https://dhbox.commons.gc.cuny.edu/blog/2014/dh-box-takes-off


This is it: DH Box is officially launching. The Digital GC is presenting an evening of short talks from various CUNY Graduate Center digital initiatives today, May 12 — starting off with DH Box.

I wanted to take a moment to reflect on where DH Box started and how far we’ve come. We introduced our project in early February:

What is DH Box?

Not much, so far. But we intend it to be a portable, customized linux environment for Digital Humanities learners that can rely on incredibly inexpensive technology. All you really need is a computer that runs Linux (and a monitor and keyboard, of course!) — but the platform that excites us most is the Raspberry Pi, a tiny computer that sells for just $35. Imagine a collection of DH tools, pre-installed and configured, and a set of texts for users to interrogate — all on a portable and inexpensive device.

That’s a quote from our first blog post, and it illustrates the most drastic change to our project. DH Box’s founder, Stephen Zweibel, had originally envisioned DH Box as a set of scripts that, when run, installed common DH applications (think Omeka, MALLET, NLTK) onto the user’s system; additionally, DH Box could be shipped with its suite of tools pre-installed on the light and portable Raspberry Pi computer.

As DH Box developed, its platform shifted: rather than dealing with the idiosyncrasies of each individual’s system, we moved to hosting instances of a virtual computer that any user could launch.

This was a vast and visible shift. But many other project elements, though they changed less drastically, also developed on the journey from DH Box’s inception to its official launch.

DH Box Development and Testing

We’ve made big strides developing the front end interface to launch a new DH Box, and the Welcome page/menu that acts as the DH Box ‘home base’. We received extremely helpful feedback from some generous volunteer user experience testers at City Tech, and valuable advice from Chris Stein, Director of User Experience for the CUNY Academic Commons.

The results of our first round of user experience testing gave our team some great insights, and a fresh perspective on the project. We learned that perhaps one of our biggest challenges is effectively conveying the concept of the project in a readily digestible way.

We discovered that users can easily get the impression that DH Box is essentially a website, when in fact it’s much more than that (it’s a computer!). It’s understandable that this virtual computer could be confused for a website since DH Box’s primary navigation happens through your web browser. A distinct IP address is assigned to each DH Box instance at the time of launch. DH Box users navigate to applications (MALLET, Omeka, etc.) through specific ports designated for each tool. The “port” is just a unique numeric identifier appended, after a colon, to the end of your DH Box IP address. This same system of unique identifiers is the basis of the internet; there’s an IP address behind every website.
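To make the address-plus-port idea concrete, here is a minimal Python sketch. The IP address and the port numbers are made-up placeholders, not DH Box's actual assignments, and `tool_url` is a hypothetical helper written only for this illustration.

```python
# Hypothetical port assignments for illustration only; the real DH Box
# ports are not listed in this post.
EXAMPLE_PORTS = {"omeka": 8080, "mallet": 8081}


def tool_url(ip_address, tool):
    """Build a browser address: the instance IP, a colon, then the tool's port."""
    return f"http://{ip_address}:{EXAMPLE_PORTS[tool]}"


# 203.0.113.10 is a reserved documentation address, not a real DH Box.
print(tool_url("203.0.113.10", "omeka"))
```

Every tool on the same DH Box shares one IP address; only the number after the colon changes.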

We as a team are now reexamining how to explain the system of navigation, along with all of the fantastic stuff a virtual computer can offer so that users will be ready to push DH Box to the limit.

Outreach: collaborative promotion

Promoting a project in development is not an easy task, but it isn’t impossible. Like other projects of the DH class, the outreach approach of Beyond Citation has been conceived as a collaborative effort from all team members. We all feel confident that the project has much to offer. The real question is how to make it known to our potential users. However, it is hard to measure what a strong audience is. Right now we have a monthly average of 285 unique blog visitors. Is this number enough? What is considered a success in the online world?

Also, it is important to keep in mind that outreach isn’t a popularity contest; it is a combination of individual and collective actions working toward a common goal of engagement with the project. And that entails more than just counting the number of visitors.

Therefore, it was imperative to understand who our core audience is. Based on the type of information found on Beyond Citation, we believe that scholars and users of academic databases will be our core audience. But what is the best way to reach them? Once the website is fully operational, having a minimum of 28 monthly users could be considered an outreach achievement. That would tell us that at least 10% of our blog audience understood what Beyond Citation is.
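The 28-user target follows from the blog figure quoted earlier; a quick sketch of the arithmetic, assuming the threshold is rounded down to whole users (the variable names are ours):

```python
monthly_blog_visitors = 285  # monthly average of unique blog visitors reported above
target_percent = 10          # share of the blog audience we hope becomes site users

# Integer division rounds 28.5 down to whole users.
minimum_site_users = monthly_blog_visitors * target_percent // 100
print(minimum_site_users)
```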

WordPress is used for blogging, with different members of the group contributing. Their posts have covered topics from understanding what an academic database is to questioning the importance of digital tools in the academic world. We are getting a good response from the online community: many of the posts have received feedback through comments, tweets and retweets. The blog has become the main voice of the project while the platform is still under construction.

Since Beyond Citation is a digital project, it seemed logical to use digital tools such as WordPress, Twitter and LinkedIn for online promotion.

Twitter has provided us with valuable ways to interact with scholars and members of the academy. This powerful tool is the main social network we use to promote Beyond Citation. According to our Google Analytics report, 95% of social referrals come from links shared on Twitter.

LinkedIn, on the other hand, is a relatively new outreach strategy. The principal reason to have a LinkedIn account is to create a deeper online presence. This social network allows us to find specific users (based on their professional profiles) and to establish different paths for promoting our project on the web.

Nevertheless there are some concerns surrounding what to do next. Press kits, tutorials and even podcasts are possible future outreach actions. However, we still need a final product, something more tangible to promote. Then each member of the team will have another task: keeping the interest of the users.

Collaborative Opportunities

The Travelogue team has been exploring how other sites are using maps as digital pedagogical tools.  We are also connecting with possible collaborators, including other mapping projects, educational institutions and libraries.

In an effort to participate in the conversations happening on social network platforms, Travelogue has been monitoring how Twitter is being used by similar projects. We have explored hashtags used in reference to maps, literature, teaching, English, History, Social Studies, high school teachers, lesson plans, etc. We have also been following the conversations and posts on the Humanities, Arts, Science and Technology Alliance and Collaboratory (HASTAC) site.

On the development front we are playing with several WordPress Child Themes to see which will work best for the Travelogue site and the ESRI Storymap we will be using. Research-wise, we have completed a workable draft of the Ernest Hemingway content spreadsheet, which we will use to construct Travelogue’s Ernest Hemingway StoryMap.

The Travelogue Commons site has a categorized Research section featuring helpful resources compiled over the course of the Travelogue project, for example, Esri Storymaps for Education.

Thank you for following our journey.  We look forward to sharing our connections with others in the GIS world.

If you want to contact us please do. Our project blog is at  travelogue.commons.gc.cuny.edu. Email us at dhtravelogue [at] gmail [dot] com or follow us on Twitter @DhTravelogue

Thinking About Authority and Academic Databases

Beyond Citation hopes to encourage critical thinking by scholars about academic databases. But what do we mean by critical thinking? Media culture scholar Wendy Hui Kyong Chun has defined critique as “not attacking what you think is false, but thinking through the limitations and possibilities of what you think is true.”

One question that the Beyond Citation team is considering is the scholarly authority of a database. Yale University Library addresses the question of scholarly authority in a handout entitled the “Web vs. Library Databases,” a guide for undergraduates. The online PDF states that information on the web is “seldom regulated, which means the authority is often in doubt.” By contrast, “authority and trustworthiness are virtually guaranteed” to the user of library databases.

Let’s leave aside for the moment the question of whether scholars should always prefer the “regulated” information of databases to the unruly data found on the Internet. While Yale Library may simply be using shorthand to explain academic databases to undergraduates, to the extent that they are equating databases and trustworthiness, I think they may be ceding authority to databases too readily and missing some of the complexity of the current digital information landscape.

Yale Library cites Academic Search and Lexis-Nexis as examples of databases. Lexis-Nexis is a compendium of news articles, broadcast transcripts, press releases, law cases, as well as Internet miscellany. Lexis-Nexis is probably authoritative in the sense that one can be comfortable that the items accessed are the actual articles obtained directly from publishers and thus contain the complete texts of articles (with images removed). In that limited sense, items in Lexis-Nexis are certainly more reliable than results obtained from a web search. (Although this isn’t true for media historians who want to see the entire page with pictures and advertisements included. For that, try the web or another newspaper database). Despite its relatively long pedigree for an electronic database, careful scrutiny of results is just as crucial when doing a search in Lexis-Nexis as it is for an Internet search.

In some instances, especially when seeking information about non-mainstream topics, searching the Internet may be a better option. Composition and rhetoric scholar Janine Solberg has written about her experience of research in digital environments, in particular how full-text searches on Amazon, Google Books, the Internet Archive and HathiTrust enabled her to locate information that she was unable to find in conventional library catalogs. She says, “Web-based searching allowed me not only to thicken my rhetorical scene more quickly but also to rapidly test and refine questions and hypotheses.” In the same article, Solberg calls for “more explicit reflection and discipline-specific conversation around the uses and shaping effects of these [digital] technologies” and recommends as a method “sharing and circulating research narratives that make the processes of historical research visible to a wider audience . . . with particular attention to the mediating role of technologies.”

Adding to the challenge of thinking critically about academic databases is their dynamic nature. The terrain of library databases is changing as more libraries adopt proprietary “discovery” systems that search across the entire set of databases to which libraries subscribe. For example, the number of JSTOR users has dropped “as much as 50%” with installations of discovery systems and changes in Google’s algorithms. Shifts in discovery have led to pointed discussions between associations of librarians and database publishers about the lack of transparency of search mechanisms. In 2012, Tim Collins, the president of EBSCO, a major database and discovery system vendor, found it necessary to address the question of whether vendors of discovery systems favor their own content in searches, denying that they do. There is, however, no way for anyone outside the companies to verify his statement because the vendors will not reveal their search algorithms.

While understanding the ranking of search results in academic databases is an open question, a recent study comparing research in databases, Google Scholar and library discovery systems by Asher et al. found that “students imbued the search tools themselves with a great deal of authority,” often by relying on the brand name of the database. More than 90% of students in the study never went past the first page of search results. As the study notes, “students are de facto outsourcing much of the evaluation process to the search algorithm itself.”

In addition, lest one imagine that scholars are immune to an uncritical perspective on digital sources, in his study of the citation of newspaper databases in Canadian dissertations, historian Ian Milligan says that scholars have adopted the use of these databases without achieving a concomitant perspective on their shortcomings. Similarly to the Asher et al. study of undergraduate students, Milligan says, “Researchers cite what they find online.”

If critique is, as Chun says, thinking through the limitations and possibilities of what we think is true, then perhaps by encouraging reflective conversations among scholars about how these ubiquitous digital tools shape research and the production of knowledge, Beyond Citation’s efforts will be another step toward that critique.

We are at blog.beyondcitation.org. Email us at BeyondCitation [at] gmail [dot] com or follow us on Twitter @beyondcitation as we get ready for the launch in May.

Travelogue: Format Selection and Other Updates

The team chose the ESRI ArcGIS Storymaps platform for the Travelogue project. Last week the team voted on which ESRI ArcGIS Storymaps format to go with. The options were:

Map Tour (Sequential, Place-based Narratives): http://storymaps.arcgis.com/en/app-list/map-tour/

Short List (A Curated List of Points of Interest): http://storymaps.arcgis.com/en/app-list/shortlist/

Tabbed Viewer (Comparing Two or More Maps): http://storymaps.arcgis.com/en/app-list/tabbed-viewer/

Side Accordion (Comparing Two or More Maps): http://storymaps.arcgis.com/en/app-list/side-accordion

Playlist (A Curated List of Points of Interest): http://storymaps.arcgis.com/en/app-list/playlist

The winner was…Map Tour http://storymaps.arcgis.com/en/app-list/map-tour/

Each team member has an Esri ArcGIS organizational account that can be used to practice and publish.  With the format selected and a large volume of research content done we can now start building.  The American authors that we have chosen to initially feature are Zora Neale Hurston and Ernest Hemingway.  We have shared Google Drive folders for each that feature spreadsheets with the research collected so far.  The spreadsheet entries are organized with a unified chronological date so that the journeys can be mapped chronologically.  All of the locations on both spreadsheets also have coordinates.
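As a rough sketch of why a unified date column and coordinates matter, the Python below sorts spreadsheet rows into travel order, assuming the sheet is exported as CSV. The column names, places, dates, and coordinates are all invented placeholders, not entries from our research spreadsheets, and `load_stops` is a hypothetical helper.

```python
import csv
import io
from datetime import date

# Placeholder rows standing in for a research spreadsheet exported as CSV.
# Every value here is invented for illustration.
SAMPLE_CSV = """date,place,latitude,longitude
1930-05-01,Place B,10.0,20.0
1925-03-15,Place A,30.0,40.0
1934-11-20,Place C,50.0,60.0
"""


def load_stops(csv_text):
    """Parse the rows and return them ordered by date, ready to become map stops."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["date"] = date.fromisoformat(row["date"])  # one unified date format
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
    return sorted(rows, key=lambda row: row["date"])


for stop in load_stops(SAMPLE_CSV):
    print(stop["date"].isoformat(), stop["place"], stop["latitude"], stop["longitude"])
```

Because every date is written the same way, one sort puts each journey in order regardless of how the rows were entered.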

Informational text about each author is being written, and audiovisual material to be featured on the Travelogue site is being collected. Notably, this includes direct links to Hemingway images from the JFK Library’s Media Gallery (http://www.jfklibrary.org/JFK/Media-Gallery.aspx). For the content sources we have chosen to use the MLA citation format.

The Travelogue’s Twitter account has received a few new followers.  Also, a Travelogue tweet was favorited by a San Francisco Chronicle newspaper Book Editor (all acknowledgements count).  The Twitter logo has been redesigned.  The look of the Twitter page has been updated to reflect the biblio and cartographic aspects of the project. Check it out @dhtravelogue

The team is looking forward to providing a status update presentation to the DH Praxis class on Monday, March 24th.

If you want to contact us please do. Our project blog is at  travelogue.commons.gc.cuny.edu. Email us at dhtravelogue [at] gmail [dot] com or follow us on Twitter @DhTravelogue

Finding a Home: Travelogue Picks a URL

The Travelogue team has been navigating the URL waters (travel puns abound but URL names do not).  By Monday, March 17th the URL had been decided upon and purchased.  Details soon to follow (we will let you know when to begin the drum roll).

Other updates: On the Travelogue’s Commons page the Twitter feed has been updated, removing the icons and making it more text-based. The team is also choosing between paper texture images to be used for the Travelogue’s Commons site background, consulting guides on 2014 web design trends. We have been actively working on the Zotero citations for the content that will be featured on the Travelogue site. Meet-ups outside of normal class hours have been scheduled. We have been outlining the research that has been done so far and what still needs to be worked on. Zora Neale Hurston and Ernest Hemingway are the two American authors that the Travelogue project will initially focus on. Research-wise, we are currently working on historical context, researching what was going on in the locations that they traveled to during their time there.

If you want to contact us please do. Our project blog is at  travelogue.commons.gc.cuny.edu. Email us at dhtravelogue [at] gmail [dot] com or follow us on Twitter @DhTravelogue

DH Box: Making a place in the Cloud

The DH Box team has made exciting strides over the past week!

As some may know, DH Box will be available on a pre-installed, pre-configured Debian cloud server. To achieve this, we are using Amazon Web Services. For those who aren’t initiated, AWS is a vast cloud computing infrastructure (with internet servers throughout the world) that offers services very similar to what a physical computer would. But AWS brings unrivaled scale, flexibility, and economy (pay as you go pricing).

DH Box’s Intro to Cloud Computing

Dennis Tenen led the DH Box team through its first group workshop on setting up a virtual web server, a.k.a. an “EC2 Instance.” The virtual web server is launched from an Amazon Machine Image (think of it as an identical copy) of an operating system. DH Box will be freely available for users to launch their own instance from our image. This solution saves users the trouble of downloading and installing tools to their own computers.

What do users need to access DH Box in the Cloud?

It will be pretty simple: users must sign up for a free AWS account. We’re making use of AWS’ CloudFormation utility (templates that deploy services rapidly) to automate many of the steps required to launch a new instance from an AMI. We also have custom scripts to automate the launch of DH Box files and software once users copy our server image. We’re really excited about being introduced to this powerful service, and even more encouraged that our configuration templates will allow DH Box users to dive swiftly into DH inquiry.
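For readers curious what such a template looks like, here is a minimal sketch in Python that builds a one-instance CloudFormation template and prints it as JSON. It is not the actual DH Box template: the image ID `ami-00000000` is a placeholder, and the resource name `DHBoxInstance` and the instance type are assumptions made for illustration.

```python
import json

# Illustrative only: a single EC2 instance launched from a machine image.
# "ami-00000000" is a placeholder, not the real DH Box image ID.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative single-instance stack (not the actual DH Box template)",
    "Resources": {
        "DHBoxInstance": {  # assumed resource name
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-00000000",
                "InstanceType": "t2.micro",  # assumed size
            },
        }
    },
    "Outputs": {
        "PublicIp": {
            "Description": "Address to open in a browser once the instance is running",
            "Value": {"Fn::GetAtt": ["DHBoxInstance", "PublicIp"]},
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

The point of a template like this is that the launch steps live in one file, so a user fills in very little by hand.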

This is just the beginning: we’re focusing heavily on providing thorough documentation so that DH Box users will have everything they need to get up and running. Stay tuned!

Special thanks to Prof. Dennis Tenen for his amazing Intro to Cloud Computing Workshop.

Beyond Citation: Building digital tools to explain digital tools

Over the last couple weeks, the Beyond Citation team has transformed into a web production team of sorts, focused on making key decisions about platform, site architecture, user interaction, design, and communication.

Beyond Citation—a project to build a website that aggregates accessible, structured information about scholarly databases—has the potential to enhance how scholars approach, use, and interpret resources from some of today’s most widely used digital collections. While it would be straightforward for our team to simply gather and publish information about those resources, our challenge is to build a digital tool that supports meaningful interaction with that information, one that can also scale in the future and cater to a community of contributors.

In the project’s nascent stages, the tactical concerns before us are familiar—we’re taking on the common challenge of building and launching a website or web app. Thrust into the very practical realm of software, decisions, and constraints, discussions of critical theory get put off to discuss the merits of WordPress and Drupal. These powerful tools place the project in a digital ecosystem much wider than academia. The platform we have chosen—WordPress—pushes us deeper still into the wide worlds of relational databases, server-side scripting, and content management—the digital tools that will allow us to explain other digital tools.

As we construct the basic building blocks for the site, we find that the best way to focus our approach is by seeking the advice of experts, reading blogs about WordPress customization, and learning more about MySQL and WordPress taxonomies. The robust open source community behind WordPress has enabled us to confirm that the technical requirements for the Beyond Citation website can be met many times over through combinations of WordPress plugins.

Something to consider while building this tool with WordPress is that we are seeking to publish data about proprietary tools by using open source technology. Perhaps this isn’t really so unusual: we see this in a similar vein as increasingly popular APIs that allow for easier data aggregation or configuration from multiple sources. And toolsets that are hybrids of proprietary and open source systems are extremely common.

But there’s an important depth to explore when thinking about Beyond Citation as a bridge between proprietary and open source systems. The idea of “exposed” information, built on “hidden” information, represents a direction that the project can try to push technically. For instance, if in a future iteration the team can uncover information about scholarly databases that’s not just hard to find, but not openly available (such as how search algorithms work, or the criteria behind publisher contracts), then I think the value of Beyond Citation increases in a direction most closely aligned with its original ambition. This would also allow the project to explore the similarities and differences in how scholarly databases work in more meaningful ways.

Before we can do that, everyone on the team is doing their part to fill in knowledge gaps, and discovering “how technology works” on multiple levels. Just as we are researching the types of information about scholarly databases that we want the project to highlight, we are also researching the types of data-driven web frameworks that could easily support such information. Like many Digital Humanities projects, Beyond Citation is about knowledge acquisition and aggregation for both developers and researchers. We are challenging ourselves to learn as much as we can about one set of digital tools before we can communicate new information about other sets of digital tools—both of which are moving targets, evolving in their own realms of authorship.

As we work towards a May launch date for an early version of the site, we realize that the authors of digital projects need a constant appetite for more knowledge—technical knowledge and subject-matter knowledge—in order to create and maintain an authoritative tool.

Follow us on Twitter as we get ready for May: @beyondcitation

It’s a Two-Fer!

Travelogue group members
Sarah – Project Manager
Amy – Technology and Design
Melanie – Outreach and Communication
Evonne – Research
Adam – Technology and Design

Last week, due to illness, the Travelogue’s outreach and communication person was ironically silenced.  However, that means this week there is twice as much Travelogue team blog fun to catch up on!

Travelogue’s Twitter page has a great new logo courtesy of Adam. Initially, we had encountered an issue with the first Travelogue logo not looking great when sized down for Twitter. Adam also created the Travelogue logo that appears on the Travelogue’s Commons page. Throughout the design process, Adam shared drafts for input from the group. Amy has been hard at work on the design and content of the Travelogue’s Commons page.

Last Monday, March 3rd, the team, sans one under-the-weather outreach and communication member, presented an update on the project status to the DH Praxis class. In preparation, Sarah created an action plan outlining how each team member could explain the progress the team has made so far.

Sarah met with our DH Praxis professor Matt Gold to go over the scope of the project and get his input on the team’s current ideas. Sarah is working on the Travelogue website’s wireframe and created a mock-up of the layout. She is also continuously working on the project plan. The team has been actively communicating; to organize that communication and each team member’s responsibilities, Sarah established an Asana page for the team.

Evonne has been compiling research resources, organizing the research conducted, identifying what needs further research, and maintaining citations in a Travelogue Zotero page. Using Evonne’s extensive research as a guide, along with the Gale database Directory of Special Libraries and Information Centers, Melanie has been reaching out to multiple academic institutions. The preliminary goal is to introduce the Travelogue project, request info on the usage of content (for example, from the Library of Congress), and build relationships from there. Through the Travelogue Twitter account, Melanie has followed organizations working on mapping projects and will be actively working on creating engaging content in the pursuit of followers.

The team has been exploring ArcGIS Story Maps as the mapping tool for the project. A schedule of meetings outside of class is being established so that we can brainstorm collaboratively face to face. The team is looking into whether Travelogue will parallel the travel narratives of the chosen authors (Ernest Hemingway and Zora Neale Hurston), literally displaying the travel trajectories of both on the same map, or whether each author’s journey will be depicted on a separate map. The website’s URL is also currently being decided upon.

If you want to contact us please do. Our project blog is at  travelogue.commons.gc.cuny.edu. Email us at dhtravelogue [at] gmail [dot] com or follow us on Twitter @DhTravelogue