It’s the Content, Stupid.

I teach workshops on library databases to a range of users throughout the year at the New York Public Library. Some of the walk-in students are academics, others are unaffiliated scholars, and many more are undergraduate or graduate students from nearby schools. The degree to which they’re familiar with platforms, searching, Boolean logic, peer review, and formats varies. But one thing all the students share is general confusion as to which database they should use for the kind of research they’re conducting.

The database vendors don’t help: Readex? Never heard of it. ProQuest? Sounds vaguely familiar. And the database names (Academic Search Premier, Ulrich’s, Project Muse) are opaque. Yes, some exact titles, like The New York Times or Chicago Defender, can steer users in a general direction, but without a greater understanding of the kind of content that can be found in each resource, users are left to fend for themselves. And that usually means Google. While Google is not an inherently bad choice, especially for initial research queries, many beneficial subscription resources are left unexplored.

Take online reference databases—in the past, a question asked at the information desk often resulted in a librarian directing the user towards the section of the physical reference shelf where one might find sources to help. Today, much of that reference shelf has moved online to platforms like Credo or Gale Virtual Reference Library. The online sources may provide 24/7 access to information, but finding relevant titles is often more difficult.

Theoretically, that’s where discovery platforms like Summon and EBSCO Discovery Service come in. Discovery platforms search the metadata of nearly all the library’s subscription resources simultaneously so users don’t need to visit each database individually. But they are only helpful if the service your library subscribes to indexes the databases that you need. EBSCO Discovery Service, for example, doesn’t index ProQuest products, and vice versa. Therefore, if you’re using EBSCO for a search on historical newspapers or periodicals, your results will be greatly limited.

Perhaps it’s no surprise, then, that 97% of academic library directors surveyed in the recent Ithaka S+R survey cite teaching information literacy to undergraduates as an important function of the library. With such limited transparency of online sources, undergraduates clearly need all the help they can get when starting their research.

The Beyond Citation team hopes that researchers—both seasoned and amateur—will shine the light on databases they use regularly by examining the database’s strengths, weaknesses, and the overall range of material. In other words, the content. Because without a better understanding of the troves of rich information discoverable in each database, they’re all just links on a page.

We are at blog.beyondcitation.org. Email us at BeyondCitation [at] gmail [dot] com or follow us on Twitter @beyondcitation as we get ready for the launch in May.

 

DH Box considers deployment options

Cross-posted from the DH Box Blog: https://dhbox.commons.gc.cuny.edu/blog/2014/deployment-options-dh-box


Once DH Box knew the platform it would adopt, it was simply a matter of figuring out the best way to utilize that platform. But was it so simple?

This week the DH Box team has been trying to strike a balance: providing a robust tool that is useful for its intended audience while keeping maintenance manageable for its administrators.

To recap — the platform chosen for delivering the DH Box environment, ready with DH tools installed, is a web server image provided through Amazon’s AMI (Amazon Machine Image) appliance. This will deliver, in essence, an identical copy of a tool-laden operating system to any user’s system.

Choosing this platform offered important benefits, such as freedom from having to address issues caused by tools being installed on users’ personal systems. However, it also introduced tension: to deploy images hosted by Amazon, one needs to use an Amazon account. Would we have users create their own Amazon Web Services (AWS) accounts, which require credit card information (though launching the image is a free service), or would we maintain an account from which instances would be launched and figure out how the DH Box team would handle any related charges?

Many questions entered into this equation: Would our intended users be open to providing credit card information? Who might this alienate? Or, if we managed the AWS account with many instances running, would we incur charges we’re not prepared to deal with? What would be the time-period allotted to users for running the instances?

DH Box has had to think through how different deployment options (e.g. requiring users to have their own AWS accounts) might affect how DH Box will be adopted by intended users. And this tension, between providing a service that is maintainable and sustainable and one that is useful to the intended audience, is something any project like DH Box might face.

User experience testing and documentation

DH Box is really taking shape! We have a bare-bones version of our server image up and running thanks to all of Steve’s hard work over the last week. We have revised our project plan with new milestone dates and a clear-cut set of tasks we need to accomplish. We are working hard on everything we need to do now and also looking forward to the next phase.

User experience testing and documentation will be very important over the next few weeks. We need to be sure that people who are not already familiar with the command line, cloud computing, and DH tool installation will find DH Box easy and convenient to use. Documentation (aka the “user manual”) will be the key to helping users make the most of DH Box. We have decided to use Read the Docs to host our documentation. Read the Docs allows us to host documentation files on our website and update our documentation when pushing to the GitHub repository that hosts our website. This means updating our online documentation is as simple as updating text on our website! One great benefit of using a utility like Read the Docs is that our documentation will be easily maintainable, forkable by contributors, available online, and searchable.

Thinking About Authority and Academic Databases

Beyond Citation hopes to encourage critical thinking by scholars about academic databases. But what do we mean by critical thinking? Media culture scholar Wendy Hui Kyong Chun has defined critique as “not attacking what you think is false, but thinking through the limitations and possibilities of what you think is true.”

One question that the Beyond Citation team is considering is the scholarly authority of a database. Yale University Library addresses the question of scholarly authority in a handout entitled “Web vs. Library Databases,” a guide for undergraduates. The online PDF states that information on the web is “seldom regulated, which means the authority is often in doubt.” By contrast, “authority and trustworthiness are virtually guaranteed” to the user of library databases.

Let’s leave aside for the moment the question of whether scholars should always prefer the “regulated” information of databases to the unruly data found on the Internet. While Yale Library may simply be using shorthand to explain academic databases to undergraduates, to the extent that they are equating databases and trustworthiness, I think they may be ceding authority to databases too readily and missing some of the complexity of the current digital information landscape.

Yale Library cites Academic Search and Lexis-Nexis as examples of databases. Lexis-Nexis is a compendium of news articles, broadcast transcripts, press releases, law cases, as well as Internet miscellany. Lexis-Nexis is probably authoritative in the sense that one can be comfortable that the items accessed are the actual articles obtained directly from publishers and thus contain the complete texts of articles (with images removed). In that limited sense, items in Lexis-Nexis are certainly more reliable than results obtained from a web search. (Although this isn’t true for media historians who want to see the entire page with pictures and advertisements included. For that, try the web or another newspaper database.) Despite the relatively long pedigree of Lexis-Nexis for an electronic database, careful scrutiny of results is just as crucial when doing a search in Lexis-Nexis as it is for an Internet search.

In some instances, especially when seeking information about non-mainstream topics, searching the Internet may be a better option. Composition and rhetoric scholar Janine Solberg has written about her experience of research in digital environments, in particular how full-text searches on Amazon, Google Books, the Internet Archive and HathiTrust enabled her to locate information that she was unable to find in conventional library catalogs. She says, “Web-based searching allowed me not only to thicken my rhetorical scene more quickly but also to rapidly test and refine questions and hypotheses.” In the same article, Solberg calls for “more explicit reflection and discipline-specific conversation around the uses and shaping effects of these [digital] technologies” and recommends as a method “sharing and circulating research narratives that make the processes of historical research visible to a wider audience . . . with particular attention to the mediating role of technologies.”

Adding to the challenge of thinking critically about academic databases is their dynamic nature. The terrain of library databases is changing as more libraries adopt proprietary “discovery” systems that search across the entire set of databases to which libraries subscribe. For example, the number of JSTOR users has dropped “as much as 50%” with installations of discovery systems and changes in Google’s algorithms. Shifts in discovery have led to pointed discussions between associations of librarians and database publishers about the lack of transparency of search mechanisms. In 2012, Tim Collins, the president of EBSCO, a major database and discovery system vendor, found it necessary to address the question of whether vendors of discovery systems favor their own content in searches, denying that they do. There is, however, no way for anyone outside the companies to verify his statement because the vendors will not reveal their search algorithms.

While how academic databases rank search results remains an open question, a recent study comparing research in databases, Google Scholar and library discovery systems by Asher et al. found that “students imbued the search tools themselves with a great deal of authority,” often by relying on the brand name of the database. More than 90% of students in the study never went past the first page of search results. As the study notes, “students are de facto outsourcing much of the evaluation process to the search algorithm itself.”

In addition, lest one imagine that scholars are immune to an uncritical perspective on digital sources, in his study of the citation of newspaper databases in Canadian dissertations, historian Ian Milligan says that scholars have adopted the use of these databases without achieving a concomitant perspective on their shortcomings. Echoing the Asher et al. study of undergraduate students, Milligan says, “Researchers cite what they find online.”

If critique is, as Chun says, thinking through the limitations and possibilities of what we think is true, then perhaps by encouraging reflective conversations among scholars about how these ubiquitous digital tools shape research and the production of knowledge, Beyond Citation’s efforts will be another step toward that critique.

Travelogue: Format Selection and Other Updates

The team chose the ESRI ArcGIS Storymaps platform for the Travelogue project. Last week the team voted on which Storymaps format to use. The options were:

Map Tour (sequential, place-based narratives): http://storymaps.arcgis.com/en/app-list/map-tour/

Short List (a curated list of points of interest): http://storymaps.arcgis.com/en/app-list/shortlist/

Tabbed Viewer (comparing two or more maps): http://storymaps.arcgis.com/en/app-list/tabbed-viewer/

Side Accordion (comparing two or more maps): http://storymaps.arcgis.com/en/app-list/side-accordion

Playlist (a curated list of points of interest): http://storymaps.arcgis.com/en/app-list/playlist

The winner was…Map Tour http://storymaps.arcgis.com/en/app-list/map-tour/

Each team member has an Esri ArcGIS organizational account that can be used to practice and publish. With the format selected and a large volume of research done, we can now start building. The American authors that we have chosen to feature initially are Zora Neale Hurston and Ernest Hemingway. We have a shared Google Drive folder for each author, containing spreadsheets with the research collected so far. The spreadsheet entries use a unified date format so that the journeys can be mapped chronologically. All of the locations on both spreadsheets also have coordinates.
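
As a rough illustration of how dated, geocoded spreadsheet rows could be put in travel order before mapping, here is a minimal Python sketch. The column names (`author`, `place`, `date`, `latitude`, `longitude`), the sample rows, and the GeoJSON output are all assumptions made for the example; they are not the team's actual spreadsheet layout, research data, or the import format Storymaps expects.

```python
import csv
import io
import json
from datetime import date

# Placeholder rows: invented values, NOT the team's research data.
# The column names are hypothetical as well.
SAMPLE_CSV = """author,place,date,latitude,longitude
Example Author,Stop A,1925-06-01,30.0,-40.0
Example Author,Stop B,1921-03-15,10.0,-20.0
Example Author,Stop C,1930-11-20,50.0,-60.0
"""


def load_stops(csv_text):
    """Parse CSV text and return the rows sorted by date, oldest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        # A unified ISO date (YYYY-MM-DD) makes chronological sorting trivial.
        row["date"] = date.fromisoformat(row["date"])
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
    return sorted(rows, key=lambda r: r["date"])


def to_geojson(stops):
    """Build a GeoJSON FeatureCollection with one point per stop."""
    features = []
    for order, stop in enumerate(stops, start=1):
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude].
            "geometry": {
                "type": "Point",
                "coordinates": [stop["longitude"], stop["latitude"]],
            },
            "properties": {
                "author": stop["author"],
                "place": stop["place"],
                "date": stop["date"].isoformat(),
                "order": order,  # position in the journey
            },
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    print(json.dumps(to_geojson(load_stops(SAMPLE_CSV)), indent=2))
```

Keeping the sort separate from the export step means the same ordered list could feed a different output format if the mapping platform needs one.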

Informational text about each author is being written, and audiovisual material to be featured on the Travelogue site is being collected. Notably, this includes direct links to Hemingway images from the JFK Library’s Media Gallery (http://www.jfklibrary.org/JFK/Media-Gallery.aspx). For the content sources, we have chosen to use the MLA citation format.

The Travelogue’s Twitter account has received a few new followers. Also, a Travelogue tweet was favorited by a book editor at the San Francisco Chronicle (all acknowledgements count). The Twitter logo has been redesigned, and the look of the Twitter page has been updated to reflect the bibliographic and cartographic aspects of the project. Check it out @dhtravelogue

The team is looking forward to providing a status update presentation to the DH Praxis class on Monday, March 24th.

If you want to contact us please do. Our project blog is at travelogue.commons.gc.cuny.edu. Email us at dhtravelogue [at] gmail [dot] com or follow us on Twitter @DhTravelogue

Finding a Home: Travelogue Picks a URL

The Travelogue team has been navigating the URL waters (travel puns abound but URL names do not).  By Monday, March 17th the URL had been decided upon and purchased.  Details soon to follow (we will let you know when to begin the drum roll).

Other updates: On the Travelogue’s Commons page, the Twitter feed has been updated, removing the icons and making it more text-based. The team is also choosing between paper texture images for the Travelogue’s Commons site background, consulting guides on 2014 web design trends. We have been actively working on the Zotero citations for the content that will be featured on the Travelogue site. Meet-ups outside of normal class hours have been scheduled. We have been outlining the research that has been done so far and what still needs work. Zora Neale Hurston and Ernest Hemingway are the two American authors that the Travelogue project will initially focus on. Research-wise, we are currently working on historical context: what was going on in the locations they traveled to during their time there.

[Cross-Post] Maintaining Documentation is Important – Examples of Online Platforms | Harlan Kellaway

Cross-posted from: http://harlankellaway.com/2014/03/17/online-documentation-platforms/ (March 17, 2014)


Documentation for technical projects, especially ones you envision having non-technical end users, is essential. I say this from experience as a user and as a developer who has both produced documentation for such users and taken over poorly documented projects from other developers. Having good documentation cuts down on the endless issues that come with figuring out how to use a technical product, whether from the frontend or the backend.

The reason documentation is on my mind is that one of the current projects I’m doing development for, DH Box, is one targeted towards end users of the variety described above. Not only that, but one of the systems integral to the project is a notoriously opaque one: Amazon Web Services. Moreover, it’s a mission of mine in my personal work to explore how tech can be made more accessible to those who don’t deem themselves technically savvy (I consider the technical/non-technical social binary and contributing factors to be problematic — but I’ll save that exploration for a different post!).

This week, I explored some tools that could help maintain the online documentation for our DH Box project, with a few preferences in mind: documentation that is easily updatable, documentation that is browsable and searchable, documentation that is configurable (e.g. in how it looks).

[Cross-Post] Subtle Elements of Online Branding That Can Help the Impression You Make in the Web World | Harlan Kellaway

Cross-posted from: http://harlankellaway.com/2014/03/02/online-branding-subtle-elements-to-help-web-impression/ (March 2, 2014)


I found myself in the midst of an interesting exchange a couple of weeks ago: it was over whether Twitter handles should have underscores in them or not. The person questioning the underscores had been told not to use them, period, but was unclear on why. It occurred to me that there are many subtle forms of online branding that folks who work on or with the Internet quite a bit eventually pick up, things that are obscure to more general users.

Here are a few examples of such online branding to think about when building an online presence.

[Cross-Post] Website Development for Beginners – Trade-offs Between Jekyll and WordPress | Harlan Kellaway

Cross-posted from: http://harlankellaway.com/2014/02/20/web-development-beginners-tradeoffs-jekyll-wordpress/ (February 20, 2014)


I’ve used WordPress for years as the base for just about every site I’ve created. It had occurred to me occasionally that WordPress may not be the best choice in every website development case, whether big or small, simple or complex. But, honestly, the idea of creating a website that didn’t have a database behind it hadn’t even occurred to me. A website that is almost purely HTML? I had subconsciously equated this with broken links and scrolling banners. I had equated database-backing with words like easy, maintainable, modern, and extensible.

It’s a clear case of every problem looking like a nail when the only tool you have is a hammer. But what if your problem is a thumbtack? Or a staple? A hammer might work, but it’s most definitely overkill and can even limit your creativity in solving the problem of fastening paper to things.

In terms of website development, WordPress has been my hammer and it’s gotten the job done. But over the past few weeks I had the opportunity to use a new tool: Jekyll.

DH Box: Making a place in the Cloud

The DH Box team has made exciting strides over the past week!

As some may know, DH Box will be available on a pre-installed, pre-configured Debian cloud server. To achieve this, we are using Amazon Web Services. For the uninitiated, AWS is a vast cloud computing infrastructure (with internet servers throughout the world) that offers services very similar to what a physical computer would offer. But AWS adds scale, flexibility, and economy (pay-as-you-go pricing).

DH Box’s Intro to Cloud Computing

Dennis Tenen led the DH Box team through its first group workshop on setting up a virtual web server, a.k.a. an “EC2 Instance.” The virtual web server is launched from an Amazon Machine Image (think of it as an identical copy of an operating system). DH Box will be freely available as an image from which users can launch their own instance. This solution saves users the trouble of downloading and installing tools on their own computers.

What do users need to access DH Box in the Cloud?

It will be pretty simple: users must sign up for a free AWS account. We’re making use of AWS CloudFormation (templates that deploy services rapidly) to automate many of the steps required to launch a new instance from our AMI. We also have custom scripts to automate the launch of DH Box files and software once users copy our server image. We’re really excited about being introduced to this powerful service, and even more encouraged that our configuration templates will allow DH Box users to dive swiftly into DH inquiry.
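
To give a sense of what a CloudFormation template for this kind of launch might contain, here is a minimal Python sketch that builds one as a dictionary and prints it as JSON. It is not the DH Box template: the AMI ID, the instance type, the resource name `DHBoxInstance`, and the bootstrap script are placeholders, and passing the script through `UserData` is only one assumed way that setup scripts could run at first boot.

```python
import json

# Placeholders for illustration only; these are NOT DH Box's real settings.
PLACEHOLDER_AMI_ID = "ami-00000000"
PLACEHOLDER_BOOTSTRAP = "#!/bin/bash\necho 'start DH tools here'\n"


def build_template(ami_id, bootstrap_script, instance_type="t2.micro"):
    """Return a minimal CloudFormation template that launches one EC2 instance.

    The instance is created from the given AMI, and the bootstrap script is
    handed to the instance as user data (an assumption for this sketch).
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: launch one instance from a prebuilt image",
        "Resources": {
            "DHBoxInstance": {  # hypothetical logical resource name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    # CloudFormation base64-encodes the script for EC2.
                    "UserData": {"Fn::Base64": bootstrap_script},
                },
            }
        },
        "Outputs": {
            "PublicDnsName": {
                "Description": "Address of the launched instance",
                "Value": {"Fn::GetAtt": ["DHBoxInstance", "PublicDnsName"]},
            }
        },
    }


if __name__ == "__main__":
    template = build_template(PLACEHOLDER_AMI_ID, PLACEHOLDER_BOOTSTRAP)
    print(json.dumps(template, indent=2))
```

The printed JSON is the kind of file a user would hand to CloudFormation, so one template stands in for a series of manual steps in the AWS console.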

This is just the beginning: we’re focusing heavily on providing thorough documentation so that DH Box users will have everything they need to get up and running. Stay tuned!

Special thanks to Prof. Dennis Tenen for his amazing Intro to Cloud Computing Workshop.