Abstract Analysis & Visualization: History in TCQ and JBTC

In this post, I experiment with text visualization software to reveal patterns in scholarly abstracts. Because I'm interested in the history of technical communication, I first selected two major journals centered on technical communication:

  • Technical Communication Quarterly
  • Journal of Business and Technical Communication

I accessed TCQ via EBSCOhost and JBTC via Sage Premier. From there, I searched within both journals for "history" as a term appearing only in abstracts.

I then exported that data into two .txt files (one for each journal) and saved them on my local drive. A quick Google search revealed a powerful browser-based tool called Textexture, which allows you to upload .txt, .rtf, or .pdf files. Textexture treats each file as a corpus of text data, runs an algorithm to find the most commonly used words, and visualizes them as a network.
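I don't know the details of Textexture's algorithm, but the general idea behind a word co-occurrence network can be sketched in a few lines of Python: tokenize the text, drop stopwords, and count pairs of words that appear near each other. The stopword list and window size here are illustrative assumptions, not Textexture's actual settings.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real tools use much longer ones.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "that", "this"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens, dropping stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def cooccurrence_network(text, window=4):
    """Count pairs of words that appear within `window` tokens of each other.

    The result maps an (alphabetically sorted) word pair to a weight,
    i.e., the edges of a co-occurrence network.
    """
    words = tokenize(text)
    edges = Counter()
    for i in range(len(words)):
        for j in range(i + 1, min(i + window, len(words))):
            if words[i] != words[j]:
                edges[tuple(sorted((words[i], words[j])))] += 1
    return edges

# Example: one abstract-like sentence as a miniature corpus.
abstract = ("The history of technical communication shapes how "
            "technical communication scholars frame history.")
network = cooccurrence_network(abstract)
```

The heaviest edges in `network` would correspond to the most prominent connections in a visualization like Textexture's; "technical" and "communication" dominate this toy example because they repeatedly appear side by side.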

I uploaded the .txt files from each journal and ran them through Textexture. I then changed the settings to make the visualizations public. Below, you will see two different types of visualizations for each journal: a visual summary, and a different type called polysingularity. I admittedly don't know much about the polysingularity visualization, but I decided to embed those visualizations anyway.

Technical Communication Quarterly

There are 24 total article abstracts in this corpus and visualization, with dates ranging from spring 1994 to June 2014.

Visit this link for a more substantive version of these visualizations, including quantitative data like most influential key words and contexts.

Journal of Business and Technical Communication

There are 20 total article abstracts in this text corpus, with dates ranging from July 1991 to September 2014.

Visit this link for a more substantive version of these visualizations, including quantitative data like most influential key words and contexts.

Analysis

There’s much that can be learned from the data visualized here, but there are also limitations. Ideally, this contrastive, visual analysis would allow one to compare the term “history” as it appears in the abstracts of both journals. This can help reveal how each journal has discussed “history” in different ways, simply by looking at the common terms connected to the node “history” in each visualization.

It took me some time to orient myself to these visualizations and what they represent. When hovering over a specific node, only that node’s connections appear in the visualization. The term “history,” for example, appears to have a broader and more diversified set of connections in JBTC than in TCQ. JBTC also appears to have many more nodes in total, and a much more diverse range of nodes, whereas TCQ is hindered by the metadata that came along with the exported abstract data (e.g., Taylor, copy, warranty, accuracy). These metadata nodes appear in a mustard color.

There are obvious limitations to exporting abstract data only, especially since this sort of metadata cannot be filtered out unless I manually cleaned up the .txt files before running them through Textexture. Nonetheless, this was an enjoyable and useful experiment that I learned a great deal from. If I tried this again with more time, I would gather a more substantial corpus in cleaned-up .txt files and run those instead. I would also gather data from other journals in technical communication, such as IEEE Transactions on Professional Communication and the Journal of Technical Writing and Communication.
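That cleanup step could itself be partially automated. As a minimal sketch, a short script could drop lines from the exported .txt files that match publisher boilerplate before upload; the phrases below are assumptions based on the stray nodes I saw (Taylor, warranty, accuracy), since the exact strings would depend on the EBSCO and Sage export formats.

```python
import re

# Hypothetical boilerplate phrases from database exports (copyright lines,
# warranty disclaimers, publisher names); adjust to match the real files.
BOILERPLATE = re.compile(
    r"copyright|all rights reserved|warranty|accuracy|taylor\s*&\s*francis",
    re.IGNORECASE,
)

def clean_export(raw_text):
    """Drop lines that look like database metadata; keep the abstract prose."""
    kept = [line for line in raw_text.splitlines()
            if not BOILERPLATE.search(line)]
    return "\n".join(kept)

sample = ("This article traces the history of usability testing.\n"
          "Copyright Taylor & Francis. No warranty as to accuracy.")
cleaned = clean_export(sample)
```

A filter like this would not catch every artifact, but it would keep terms like "warranty" and "accuracy" from showing up as nodes in the visualization.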

I posit that this analysis could reveal trends in written scholarly conversations about history and historiography in technical communication. A contrastive analysis could also reveal whether some of these journals operate as silos, preventing conversations from crossing between and across journals. Using another text visualization tool like Voyant could also add another element to this analysis and help me find more patterns (or see them more easily).

A similar analysis of scientific abstracts could also reveal patterns like rhetorical moves. I think this would be especially valuable for learning how scientists present technical information about the environment or risk to public audiences, for instance. It would also be an interesting and telling experiment to analyze actual historical texts of interest to technical communication scholarship. For example, environmental impact reports centered on a geographical site from the 1960s onwards could be run; I believe these visualizations could reveal the common terms and the complexity of public environmental assessment over time. If available, a similar historical approach could be taken with coal mining rescue manuals, seismic reports and warnings, and public health reports centered on nuclear risk. These data, then, could not only help find patterns in the history of public environmental and health reports, but also point to ways that those reports can be made clearer and more consistent for public audiences.