Category Archives: DH Box

DH Box Takes Off

Cross-posted from the DH Box Blog:

This is it: DH Box is officially launching. The Digital GC is presenting an evening of short talks from various CUNY Graduate Center digital initiatives today, May 12 — starting off with DH Box.

I wanted to take a moment to reflect on where DH Box started and how far we’ve come. We introduced our project in early February:

What is DH Box?

Not much, so far. But we intend it to be a portable, customized Linux environment for Digital Humanities learners that can rely on incredibly inexpensive technology. All you really need is a computer that runs Linux (and a monitor and keyboard, of course!), but the platform that excites us most is the Raspberry Pi, a tiny computer that sells for just $35. Imagine a collection of DH tools, pre-installed and configured, and a set of texts for users to interrogate, all on a portable and inexpensive device.

That’s a quote from our first blog post, and it illustrates the most drastic change to our project. DH Box’s founder, Stephen Zweibel, had originally envisioned DH Box as a set of scripts that, when run, installed common DH applications (think Omeka, MALLET, NLTK) onto the user’s system; additionally, DH Box could be shipped with its suite of tools pre-installed on the light and portable Raspberry Pi computer.

As DH Box developed, it shifted platforms: rather than dealing with the idiosyncrasies of each individual’s system, it moved to hosting instances of a virtual computer that any user could launch.

This was a vast and visible shift. But many other project elements, though they changed less drastically, also developed in the journey from DH Box’s inception to its official launch.

DH Box Development and Testing

We’ve made big strides developing the front end interface to launch a new DH Box, and the Welcome page/menu that acts as the DH Box ‘home base’. We received extremely helpful feedback from some generous volunteer user experience testers at City Tech, and valuable advice from Chris Stein, Director of User Experience for the CUNY Academic Commons.

The results of our first round of user experience testing gave our team some great insights, and a fresh perspective on the project. We learned that perhaps one of our biggest challenges is effectively conveying the concept of the project in a readily digestible way.

We discovered that users can easily get the impression that DH Box is essentially a website, when in fact it’s much more than that (it’s a computer!). It’s understandable that this virtual computer could be confused for a website since DH Box’s primary navigation happens through your web browser. A distinct IP address is assigned to each DH Box instance at the time of launch. DH Box users navigate to applications (Mallet, Omeka, etc.) through specific ports designated for each tool. The “port” is just a unique numeric identifier appended to the end of your DH Box IP address. This same protocol for assigning unique identifiers is the basis of the internet; there’s an IP address behind every website.
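To make the address-plus-port idea concrete, here is a minimal sketch in Python. The port numbers and tool names in `TOOL_PORTS` are hypothetical placeholders chosen for illustration (the actual DH Box port assignments are not listed here), and the IP address is a reserved documentation-only address.

```python
# Hypothetical port assignments, for illustration only.
# The real DH Box ports are not stated in this post.
TOOL_PORTS = {
    "omeka": 8080,
    "ipython-notebook": 8888,
}


def tool_url(ip_address: str, tool: str) -> str:
    """Build the browser address for a tool running on one DH Box instance."""
    port = TOOL_PORTS[tool]
    # The port is appended to the instance's IP address after a colon.
    return f"http://{ip_address}:{port}"


if __name__ == "__main__":
    instance_ip = "203.0.113.10"  # documentation-only example address
    for name in TOOL_PORTS:
        print(name, "->", tool_url(instance_ip, name))
```

Every tool lives on the same virtual computer (the same IP address); only the number after the colon changes from tool to tool.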

We as a team are now reexamining how to explain the system of navigation, along with all of the fantastic stuff a virtual computer can offer so that users will be ready to push DH Box to the limit.

Communicating Technical Process

With alpha work on DH Box wrapping up, it’s a good moment to reflect on some technical lessons learned, as well as some lessons about being on the technical side of a team. Up to this point, while I have been keeping my team apprised in general of DH Box’s technical situation as it progressed, most of the details of its implementation, as well as the specific tools I’ve used and their justifications, pros/cons, and possible alternatives, I have kept to myself.

This is, in part, due to the fact that I did not begin with a particular plan. Though we had a well-defined goal for DH Box, I knew that there were myriad ways to reach it. So I experimented with different methods of cloud deployment and server provisioning, that is, different ways of creating each new instance of DH Box and automatically installing all of the necessary software on it.

I started with a Bash script designed to run on the first boot of each new DH Box instance. This worked well enough, but didn’t offer much in the way of sophisticated automation or transparency for debugging. I then tried some of the more well-known server deployment/provisioning tools, like Puppet and Salt. Puppet I found less straightforward than I’d hoped, partially because it requires modules to be written in a homespun variety of Ruby, which I’m not super comfortable with. Salt did more of what I wanted, but I was still reading its documentation when I became distracted by yet another tool, Ansible.

Ansible turned out to be just what I needed: It is written in Python, a language I have more familiarity with, and it allows me to monitor each deployment of a new DH Box in real time. Using Ansible, I’ve been able to create a whole automation workflow in one language, and, even better, I can easily see if and at exactly which point a deployment fails. This is crucial to efficient problem solving and future updates for DH Box, as its installation process necessarily involves many separate moving parts.
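The value of seeing exactly where a deployment fails can be sketched in a few lines of plain Python. This is not Ansible and not the DH Box deployment code; the step names and functions below are hypothetical stand-ins, and one of them fails on purpose to show the report.

```python
from typing import Callable, List, Optional, Tuple

# A step is a human-readable name plus a function that performs it.
Step = Tuple[str, Callable[[], None]]


def run_steps(steps: List[Step]) -> Optional[str]:
    """Run steps in order; return the name of the failing step, or None."""
    for name, action in steps:
        print(f"running: {name}")
        try:
            action()
        except Exception as error:
            # Stop at the first failure and report which step broke.
            print(f"FAILED at '{name}': {error}")
            return name
        print(f"ok: {name}")
    return None


# Hypothetical steps, for illustration only.
def install_packages() -> None:
    pass  # placeholder: would install system packages


def configure_web_server() -> None:
    raise RuntimeError("config file missing")  # simulated failure


def start_services() -> None:
    pass  # placeholder: would start the installed tools


if __name__ == "__main__":
    failed_step = run_steps([
        ("install packages", install_packages),
        ("configure web server", configure_web_server),
        ("start services", start_services),
    ])
    print("first failing step:", failed_step)
```

A single first-boot script tends to give much less of this visibility: when something breaks, you mostly learn that the box did not come up, not which of the many moving parts was responsible.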

With these details of DH Box’s technical framework determined, it’s possible to create a more concrete “blueprint”, and I’m now working with our project planner, Gioia, to incorporate much more specific technical milestones into our overall plan. Going forward, I hope to keep everyone up-to-date and communicate some of what I learn along the way, without getting us too bogged-down in technical minutiae.

DH Box considers deployment options

Cross-posted from the DH Box Blog:

Once DH Box knew the platform it would adopt, it was simply a matter of figuring out the best way to utilize that platform. But was it so simple?

This week the DH Box Team has been working to strike a balance: providing a robust tool that is useful for the intended audience while keeping its maintenance manageable for its administrators.

To recap: the platform chosen for delivering the DH Box environment, ready with DH tools installed, is a web server image provided as an Amazon Machine Image (AMI). This will deliver, in essence, an identical copy of a tool-laden operating system to any user who launches it.

Choosing this platform offered important benefits — for example, freedom from having to address issues caused by tools being installed to users’ personal systems. However, it also introduced tension: to deploy images hosted by Amazon, one needs to use an Amazon account. Would we have users create their own Amazon Web Services (AWS) accounts that require credit card information (though launching the Image is a free service) or would we maintain an account that instances would be launched from and figure out how the DH Box team would handle potential related charges?

Many questions entered into this equation: Would our intended users be open to providing credit card information? Who might this alienate? Or, if we managed the AWS account with many instances running, would we incur charges we’re not prepared to deal with? What would be the time-period allotted to users for running the instances?
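One way to reason about the charges question is a back-of-the-envelope estimate: number of instances, times hours each one runs, times an hourly rate. The sketch below assumes a flat hourly rate; the rate and class size used are made-up numbers for illustration, not actual AWS prices or DH Box figures.

```python
def estimated_charge(instances: int, hours_each: float, hourly_rate: float) -> float:
    """Rough cost of running several instances for a fixed time.

    Assumes a flat hourly rate and ignores storage, data transfer,
    and any free-tier allowance.
    """
    return instances * hours_each * hourly_rate


if __name__ == "__main__":
    # Hypothetical scenario: a class of 30 users, each running an
    # instance for 3 hours, at an assumed rate of $0.02 per hour.
    cost = estimated_charge(instances=30, hours_each=3, hourly_rate=0.02)
    print(f"estimated charge: ${cost:.2f}")
```

The point is less the number itself than how quickly it scales with the time period allotted to each user, which is why that limit matters if the team manages the account.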

DH Box has had to think through how different deployment options (e.g. requiring users to have their own AWS accounts) might affect how DH Box will be adopted by intended users. And this tension, between providing a service that is maintainable and sustainable and one that is useful to the intended audience, is something any project like DH Box might face.

DH Box: Making a place in the Cloud

The DH Box team has made exciting strides over the past week!

As some may know, DH Box will be available on a pre-installed, pre-configured Debian cloud server. To achieve this, we are using Amazon Web Services. For those who aren’t initiated, AWS is a vast cloud computing infrastructure (with internet servers throughout the world) that offers services very similar to what a physical computer would. But AWS brings unrivaled scale, flexibility, and economy (pay as you go pricing).

DH Box’s Intro to Cloud Computing

Dennis Tenen led the DH Box team through its first group workshop on setting up a virtual web server image, a.k.a. an “EC2 Instance.” The virtual web server runs from an Amazon Machine Image (think of it as an identical copy) of an operating system. DH Box will be freely available, so users can launch their own instance from our image. This solution saves users the trouble of downloading and installing tools to their own computers.

What do users need to access DH Box in the Cloud?

It will be pretty simple: users must sign up for a free AWS account. And we’re making use of AWS CloudFormation (a utility whose templates deploy services rapidly) to automate many of the steps required to launch a new instance from our AMI. We also have custom scripts to automate the launch of DH Box files and software once users copy our server image. We’re really excited about being introduced to this powerful service, and even more encouraged that our configuration templates will allow DH Box users to dive swiftly into DH inquiry.

This is just the beginning: we’re focusing heavily on providing thorough documentation so that DH Box users will have everything they need to get up and running. Stay tuned!
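For readers curious what a CloudFormation template looks like, here is a minimal sketch that builds one as a Python dictionary and prints it as JSON. It is not the DH Box template: the resource name `DHBoxInstance`, the instance type, and the image ID `ami-EXAMPLE` are placeholders, and a real template would need a real image ID plus networking and security settings.

```python
import json


def build_template(image_id: str, instance_type: str = "t2.micro") -> dict:
    """Return a minimal CloudFormation template describing one EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: launch one EC2 instance from a machine image",
        "Resources": {
            # "DHBoxInstance" is a hypothetical logical name.
            "DHBoxInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                },
            }
        },
        "Outputs": {
            "PublicIp": {
                "Description": "Address to open in the browser",
                "Value": {"Fn::GetAtt": ["DHBoxInstance", "PublicIp"]},
            }
        },
    }


if __name__ == "__main__":
    # "ami-EXAMPLE" is a placeholder, not a real image ID.
    print(json.dumps(build_template("ami-EXAMPLE"), indent=2))
```

The appeal of a template is that the launch steps are written down once, so each new user gets the same machine without clicking through the AWS console by hand.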

Special thanks to Prof. Dennis Tenen for his amazing Intro to Cloud Computing Workshop.

Presenting… DH Box

In the interest of spreading the mission of DH Box far and wide, I’ve been working on a brief presentation that might also serve as an online introduction to the project. It’s available here. Take a look!

I’ll be using these slides to give a short talk about DH Box to faculty this Tuesday at Hunter College. It looks like we’ll be making quite a few presentations like this one, because as it turns out, building a community is one of the key factors determining success for DH Box. We will need the help of an invested community to:

  • Determine which tools should be included
  • Identify new platforms to target
  • Contribute to documentation
  • Spread awareness about DH Box

and it seems clear that in-person meetings and discussions are the best way for us to create interest in our work. That’s not to discount social media approaches at all; they allow for broad outreach we couldn’t manage otherwise. But in-person conversation allows us to demonstrate and discuss DH Box in greater depth, thus solidifying each potential user’s understanding and their relationship with us and our project.

New Friend, New Platform for DH Box

Cross-posted from:

This week the DH Box team reconsidered their choice of platform, with the help of Dennis Tenen, a professor at Columbia University in the Digital Humanities and New Media Studies program (and former developer with Microsoft).

A couple weeks ago we were surprised and delighted to find that another team had come up with the idea for a portable tool that could help users quickly get going with DH applications. And this week we found that Professor Tenen and colleagues had also discussed how to tackle such a project and had come up with yet a different solution! In discussing that solution, we found it matched our aim of making it easy for new users to set up an environment quickly, and it made us change our focus for both implementation and outreach.

Opening DH Box

This is it! The inaugural post of the DH Box blog (the DH stands for Digital Humanities). Here we intend to make the process of planning, creating, and publicizing the DH Box transparent for our readers. Hopefully this provides some inspiration, and even a blueprint, for future collaborative DH projects.

But let’s not get ahead of ourselves! First, some questions and answers:

What is DH Box?

Not much, so far. But we intend it to be a portable, customized environment for Digital Humanities learners that can rely on incredibly inexpensive technology. All you really need is a computer (and a monitor and keyboard, of course!) — but the platform that excites us most is the Raspberry Pi, a tiny computer that sells for just $35. Imagine a collection of DH tools, pre-installed and configured, and a set of texts for users to interrogate — all on a portable and inexpensive device.

What inspired the idea of DH Box?

Several ongoing humanities projects have begun to take advantage of the continuing miniaturization of computing technology. One in particular excited my imagination: Library Box, which repurposes a wireless router into a “portable digital file distribution tool…that enables delivery of educational, healthcare, and other vital information to individuals off the grid.” The possibilities for ’embedded’, specialized miniature computers are massive.

What is needed to run DH Box?

Our first major goal is to get DH Box running on the Raspberry Pi. Once that’s done, DH Box will also be runnable on nearly any Linux computer! We are also targeting OS X.

Who do you think will use DH Box?

Anyone and everyone who is interested in learning Digital Humanities inquiry techniques, but especially those who may not have any prior programming experience. We hope that instructors will use our tools to set up almost instant DH labs, and that students will use DH Box to get an edge in their research.

We see DH Box as an example of what is likely to be a robust and interesting future field, ‘humanities hardware’.

Who are we?

We are an interdisciplinary team of learners and do-ers, librarians and developers and digital humanists and more — with an interest in making DH work more accessible. Find us:

More to come as we continue to develop DH Box!

Refining our focus and finding connections

The DH Box team has been working hard on defining the scope for DH Box and setting up our project plan. We’ve started using Asana as our project management tool. As the project manager, I’m really enjoying Asana. It’s flexible, easy, and it allows our team to collaborate on building the plan as we go. It’s also very nice that it tracks everything and sends out plenty of reminders!

Our scope has been narrowing down as we refine our concept of DH Box. We are thinking more about who will use DH Box and thinking about the best way to make it a valuable toolkit for introductory students in digital humanities classes.

Pedagogy is a key part of the digital humanities at the CUNY Graduate Center and the Praxis Network. Our focus for the first phase of development will be text analysis and topic modeling including key tools such as Mallet, Natural Language Toolkit (NLTK), and the Stanford Named Entity Recognizer. We are going to build an interactive textbook using IPython Notebook. The textbook will be bundled with the DH Box install scripts and it will help orient students with the tools through interactive code execution. We have also thought more about our platform and what would be most useful for our users. We are going to make DH Box available for download not only for Raspberry Pi but also for Linux, Mac, and hopefully Windows.

As we have narrowed down our scope, we are also discovering a much wider range of connections to the DH community. Our professor, Matt Gold, has put us in touch with his colleague Dennis Tenen. GC Digital Fellow Micki Kaufman suggested we check out Ian Milligan’s work, and we’ve found amazing stuff in Big Digital History: Exploring Big Data through a Historian’s Macroscope, a co-written manuscript by Shawn Graham, Ian Milligan, and Scott Weingart. My library colleague Roxanne Shirazi, who edits the dh+lib blog, suggested we check out an idea for a project called DH creator stick which George Williams proposed at THATCamp Piedmont 2012 (see also a blog post by Mark Sample).

We’re amazed by the range of rich ideas we are beginning to discover. We hope to reach out to the DH community and ask for advice and feedback as DH Box takes shape.

DH Box: Tackling Project Scope

We have this great Digital Humanities project idea, but what happens between now and launch time?

With an idea like DH Box (a customized Linux OS with preinstalled DH Tools and the flexibility to operate on a computer as cheap and portable as the Raspberry Pi) there are a number of directions we could take, and we will certainly consider them for further iterations of DH Box beyond the Spring term (this blog currently documents the experiences of a project team enrolled in a graduate course in Digital Humanities Praxis at the Graduate Center, CUNY).

In order to refine the scope of our tool, we asked ourselves some questions:

  • What approach will we take to educating users about coding and about the infrastructure around the DH Box software, hardware, and operating system?
  • Which DH Tools should we include? See Alan Liu’s curated list for more info on the scope of DH tools out there
  • What user(s) are we building this for?

The success of our project hinges on our ability to carefully model the scope of the tool by shaping the answers to these questions . . . all by May 12th (public launch date)!

Educational Value

Beyond providing a collection of accessible DH Tools, we want DH Box to help bridge knowledge gaps by delivering a strong educational component. We’d like, for instance, undergraduate English students to gain exposure and develop proficiency in Digital Humanities inquiry through the kind of guidance and practical experience DH Box will offer. To that end, we will begin an interactive textbook to provide instruction about the specific tools included in this first iteration of DH Box. We are most inspired by the Learn Code the Hard Way interactive textbook series by Zed Shaw.


We are gearing this version of DH Box to bring Topic Modeling and Text Analysis to Humanities students!

We began by considering the most popular DH Tools out there and quickly realized it made a lot of sense to whittle the list down for this current project phase. We’ve made choices based on optimal software performance with the Raspberry Pi. We also want to provide DH Tools that haven’t yet proliferated as widely as some of the more popular content management systems, such as WordPress.


Undergraduate Humanities students currently have little familiarity with terms like tokenization, sentiment analysis, etc., and how these components of text analysis can open expansive modes of textual inquiry. As part of its mission, DH Box will work to make these methods accessible to a broad audience!
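To give a flavor of what a term like tokenization means in practice, here is a minimal sketch using only Python’s standard library. It is a deliberately simple approach (lowercase the text, then pull out runs of letters and apostrophes), not the tokenizer from NLTK or any other tool bundled with DH Box, and the sample sentence is made up.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


if __name__ == "__main__":
    sample = "The box is small, but the box is mighty."
    tokens = tokenize(sample)
    print(tokens)
    # Counting tokens is often the first step toward text analysis.
    print(Counter(tokens).most_common(3))
```

Once a text is broken into tokens like this, questions about word frequency, vocabulary, and topics become things a student can compute rather than only estimate by reading.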

Stay tuned for exciting updates on implementing the install scripts, using IPython Notebook, and more!


Questions? Comments? Tweet us!