Text Analyzer - May 2017

Feedback, Comments, Mentions, Questions on Twitter, Email, and Facebook

  1. Text Analyzer Beta, from JSTOR Labs, a useful utility for analyzing text and topics:  http://buff.ly/2q4S7yh 
  2. A cool search tool just got cooler. @JSTOR Text Analyzer makes it easier to cite search results or add to a list.  http://www.jstor.org/analyze  pic.twitter.com/aneCwQTlwa
  3. Hi Alex,

    Thanks for getting in touch. This morning, the 500 error is no longer appearing when accessing the Text Analyzer feature. However, when I dropped the sample document for analyzing, the system has been on the Analyzing Text and Identifying Topics stage for several minutes. Students are barely willing to wait ten seconds for something to happen! I’ll be glad to test and provide feedback if you need.

    --email from college librarian in South Carolina, May 3 10:01
  4. Thanks [redacted]! I'm glad that one problem is gone but less so that it's been traded for a second problem. That time to process seems too long -- I suspect there was an issue which I'd like to dig into. Would you mind emailing us the document you were trying, so we can see if the problem was with the doc itself?

    Thanks much for helping us make this better.

    --reply to SC librarian from Alex, May 3 10:29
  5. Tried JSTOR analyzer on my "Dinny Gordon" article. Most heavily weighted terms: "Anne Emery" (author) & "Brad Kenyon" (D's love interest).
  6. Cool! Are these the terms you'd have expected, or would you want/need different ones? (We're always trying to make it better...)  https://twitter.com/jillian6475/status/859803368541360128 
  7. @abhumphreys Some were useful, some were really not ("intellectual property law"). Also a way to limit yrs of publication would REALLY help.
  8. @abhumphreys Just limiting to "current" isn't useful to me b/c I'm a historian & my specific topic doesn't work well w/JSTOR except for primary sources.
  9. @abhumphreys Also puzzled that an article abt a fictional girl (whose name was EVERYWHERE in article) yielded fictional bf's name but not *hers*.
  10. @jillian6475 Got it! Makes perfect sense. We've actually been considering exactly this feature. Can't promise when, but hopefully not too far off...
  11. @abhumphreys I mean, articles abt friendship in schools would be great, but the current ones aren't relevant -- I need ones from 1940s-1960s :)
  12. @jillian6475 That's interesting. Her name is "Dinny Gordon"? I wonder if the tools we use to identify people hiccuped on the name... I'll look into it.
  13. @abhumphreys Yep. Article is "Dinny Gordon, Girl Intellectual" +subtitle. It's abt series of 4 bks called "Dinny Gordon [high sch yr]" +
  14. @abhumphreys "Dinny" came up as a suggested option; it was just funny that one of her 3 love interests turned up as 2nd heaviest weighted option.
  15. @abhumphreys And I had to laugh b/c that bf is clearly figured as future husband, other 2 bfs were portrayed as jerks. Loved that y'all dumped them too!
  16. @jillian6475 Yes, Text Analyzer is all about passing moral judgments on the characters in the texts it analyzes.
  17. @abhumphreys Actually, that would be kind of a cool tool? Imagine if you were writing an article about villains! <3
  18. @abhumphreys Seriously though, interesting tool -- I'm a libn & wanted to test it out to see abt adding "how to use this" to resources for historians :)
  19. @jillian6475 I know, right? We'll get on that right away: "Text Analyzer. Now with the ability to tell Good from Evil."
  20. @abhumphreys I like that you can fiddle w/the weighted terms and add your own. It's a good way to get stdts et al. to think about choosing search terms +
  21. @abhumphreys and also about how to summarize your own work and others' work as well (ideas for teaching note-taking etc.).
  22. @jillian6475 Wonderful, thanks! I'll let you know if/when we add a date filter. Let us know if you think of other ways to improve or use it!
  23. @abhumphreys It also strikes me as an interesting way to incorporate subject searching into JSTOR (which is a sticking pt for me w/JSTOR).
  24. @abhumphreys Also, could imagine a class exercise where stdts put in article they'd read for class and then discussion w/the question you asked me +
  25. @abhumphreys "Are these the terms u expected? Are there other ones you might have thought of or used?" Gets at how to summarize, how to think of keywords.
  26. @jillian6475 Love this idea! Tagging @HowArtHistory, who works to help educators use @JSTOR (and @Artstor), so she can see it.
  27. @jillian6475 Back to original topic. Would u mind pointing me at or DMing me the article you were trying? We want to dig into the "Dinny" problem...
  28. @abhumphreys Sure! It's actually not available in JSTOR -- dragged it from Zotero into the analyzer (good 2 know tht works!):  https://muse.jhu.edu/article/545839 
  29. @jillian6475 thanks!! I'll let you know if we discover anything.
  30. @abhumphreys Thanks! :) (it's actually very timely for me to revisit this article... good entry into what I'm trying to write now!)
  31. @abhumphreys @jillian6475 @JSTOR @Artstor Thanks, @abhumphreys! Now I am inspired to use fiction. I have been on rhetoric, like MLK's speeches for a dif. angle on the scholarship.
  32. @HowArtHistory @abhumphreys @JSTOR @Artstor Love teaching w/fiction. I teach dating-history course w/wk on hist teen fic--excprt frm 1952 boys bk w/a date rape  http://research.library.gsu.edu/datinghistory 
  33. @HowArtHistory @abhumphreys @JSTOR @Artstor Last time I taught it, student wondered out loud how/if the book had been reviewed (great question!) Turned out Kirkus had :)
  34. @HowArtHistory @abhumphreys @JSTOR @Artstor Thx! Alas, found out ystrday that it's not bn renewed for Fall 2017 & I have to wait until Fall 2018. Had a lot of ideas abt revising it....
  35. It would be cool if you could sort the results by publication in order to see if there were particular journals publishing a number of articles in same research area.

    --email from librarian in Massachusetts, May 3 15:21
  36. Dear [redacted],

    Thanks for the suggestion -- that sounds super-helpful. Would you mind if I asked you a follow-up question? What would you use that information for? The more that we can understand what you’re trying to do, the better we’ll be able to design this.

    Thanks much!

    --reply to Massachusetts librarian from Alex, May 3 15:25
  37. Hi, Alex,

    I'd be delighted to answer your follow-up question. I was thinking that it could be a useful tool to help students and faculty identify journals to which they might want to submit an article--they could put in a draft or an abstract and then sort to see if there were journals that had published multiple related articles in recent years.

    It could also be good for novice researchers wanting to get the lay of a land in a particular field--they could put in an article they found helpful, or a paper of their own (a dissertation or thesis prospectus, maybe) and then sort by journal to find out if there were particular publications that they wanted to keep tabs on, based on the articles they found there.

    Thank you for asking!

    --followup from librarian in Massachusetts, May 3 16:15
  38. Thanks, [redacted], those are two excellent possible uses! If/when we’re able to support them, I’ll be sure to let you know.

    If you think of any other great ideas, please let us know!

    --reply to Massachusetts librarian from Alex, May 3 16:27
  39. .@JSTOR Text Analyzer now lets you cite search results or add them to a list. Because that's what friends are for.  http://www.jstor.org/analyze  pic.twitter.com/zdcChP6tGR
  40. @abhumphreys @JSTOR Just don't make a wikipedia citation format, please. We can't accept searches as a citation except in some special cases.
  41. @abhumphreys @JSTOR My bad--the citation is next to each result. Disregard!
  42. @abhumphreys @JSTOR My guess is you were clear enough for anyone not primed to think about the horror of preformatted search citations. ;)
  43. @abhumphreys I mentioned that an analysis of 2 articles might be helpful (e.g. an article and one citing it). Might be more tractable than imp. a biblio.
  44. @abhumphreys You could also (w 2 rather than many) give options to find/condition on only common elements, only uncommon elements, etc.
  45. I was using the sample document provided on the Text Analyzer interface. Today it gives me an error when processing: [screenshot]

    --followup email from college librarian in South Carolina, May 5 8:22
  46. Thanks, [redacted]. This is very helpful feedback.

    Based on this screen shot it appears you’re using Firefox on a Mac. That combination should work so the problem is likely related to network connectivity, which generally points to bandwidth or a proxy problem.

    Did the error dialog appear immediately or did it take a minute or 2? If it appeared quickly it’s likely not a bandwidth issue.

    It looks like you’re accessing JSTOR and Text Analyzer via a proxy of some sort. I suspect this error is related to that. We’ve made some recent changes that addressed some proxy compatibility issues but it’s quite possible there are still some yet to be resolved. Would you by any chance know what proxy software is being used here? OCLC’s EZProxy, perhaps, or something else?

    --reply to SC librarian from Ron, May 5 8:45
  47. The error appeared very quickly. We’re using EZProxy and having tested both with and without, I can confirm that the error does not occur when I am on campus and avoid routing through the proxy server. Hope that helps.

    --followup email from college librarian in South Carolina, May 5 9:01
  48. Thanks so much for the quick feedback. Yes, that helps a lot. It confirms the proxy incompatibility hypothesis. We have a local EZProxy instance here that we use for development and testing. It’s evidently configured differently than your setup.

    You’ve been very helpful already. At the risk of appearing pushy here I have a follow-up question and a small request.
    · When trying to use Text Analyzer from the proxy have you been successful with other documents, including uploading from your local computer or copy/paste or drag/drop URLs other than the sample document we provide?
    · If you’re familiar with the browser developer console I’d be very interested in knowing what kind of messaging appears in the console when these errors are encountered.

    --reply to SC librarian from Ron, May 5 9:33
  49. I was successful at uploading a pdf file from my computer without the proxy, but with the proxy the error occurred consistently.

    Our EZProxy is version 5.7.44 and the JSTOR configuration stanza is:

    Title JSTOR
    HTTPHeader X-Requested-With
    Option DomainCookieOnly
    Title JSTOR
    URL jstor.org/
    HJ jstor.org
    HJ jstor.org
    HJ jstore.org
    HJ jstor.org
    HJ jstore.org
    HJ links.jstor.org
    HJ mobile.jstor.org
    HJ about.jstor.org
    HJ plants.jstor.org
    HJ uk.jstor.org
    DJ jstor.org
    Option Cookie

    I use Firebug for my development – here are the alerts I get: [screenshot]
    and to contrast, without the proxy: [screenshot]

    --followup email from college librarian in South Carolina, May 5 13:11
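
    Note on item 49: the quoted stanza repeats several lines. A minimal sketch for spotting such repeats follows. It treats the stanza purely as text and assumes nothing about how EZProxy interprets repeated directives. The function name repeated_directives is hypothetical, and nothing in the thread says the repeats caused the error.

```python
from collections import Counter

# Stanza text exactly as quoted in the librarian's email (item 49).
STANZA = """\
Title JSTOR
HTTPHeader X-Requested-With
Option DomainCookieOnly
Title JSTOR
URL jstor.org/
HJ jstor.org
HJ jstor.org
HJ jstore.org
HJ jstor.org
HJ jstore.org
HJ links.jstor.org
HJ mobile.jstor.org
HJ about.jstor.org
HJ plants.jstor.org
HJ uk.jstor.org
DJ jstor.org
Option Cookie
"""


def repeated_directives(stanza: str) -> dict[str, int]:
    """Return each non-blank line that occurs more than once, with its count.

    Whitespace is normalized so "HJ  jstor.org" and "HJ jstor.org" compare equal.
    This is a plain text check; it does not model EZProxy semantics.
    """
    lines = [" ".join(line.split()) for line in stanza.splitlines() if line.strip()]
    counts = Counter(lines)
    return {line: n for line, n in counts.items() if n > 1}


if __name__ == "__main__":
    for line, n in repeated_directives(STANZA).items():
        print(f"{n}x  {line}")
```

    A check like this only flags lines worth a second look when comparing a local proxy configuration against a reference one.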
  50. This is fantastic! Thanks so much for sending me this info. That Firebug screenshot provides some real clues. Based on that I’m going to try something here and may try to impose on you one more time to give it another try with your proxy.

    --reply to SC librarian from Ron, May 5 13:34
  51. Glad to help! Let me know what you need.

    --followup email from college librarian in South Carolina, May 5 13:34
  52. I was able to check on the TextAnalyzer from a different computer today and it worked. I will check my work machine when I get back to the office and see if the issue is resolved there as well.

    --email from librarian in West Virginia following up on problem during webinar, May 5 15:09
  53. Thanks so much, [redacted]. We really want Text Analyzer to work every time, for everyone, and your help in identifying when we’re NOT doing that is super-helpful.

    --reply to WV librarian from Alex, May 5 15:11
  54. I checked from my work machine today and the site loaded.
    --email followup from librarian in West Virginia, May 6 15:31
  55. That’s great to hear, [redacted]. Thanks for letting us know. Enjoy, and please don’t hesitate to reach out to me if you encounter any more issues, or have any suggestions on how to make Text Analyzer more useful to you and your students.

    --reply to WV librarian from Alex, May 8 8:47
  56. @eddobento We are looking into it! Thanks. In the meantime, try Text Analyzer, same function with better results:  http://www.jstor.org/analyze/ 
  57. Things have improved since we spoke.
    I’ll keep trying different files and see if I can recreate the issue.

    If I am able to follow the instructions you provided, I’ll be sure to send screen shots.

    Thanks for your help!

    --email followup from HS librarian in Texas, for whom Text Analyzer didn't work during an intercept survey, May 9 13:32
  58. *god bless the 21st century* JSTOR is offering a text analyzer that looks at what you wrote and reccs more articles, and it's sorta shoddy
  59. @dirtybags31 Hi there! What would make it less shoddy and more one of the *legitimately* awesome parts of the 21st century? What wasn't great for you?
  60. @abhumphreys the text analyzer correctly assumed that i used Jill Mann in my paper as a point but gave it too much weight--but i liked how i could adjust
  61. @abhumphreys the weight that jill man was given + include more phrases for the analyzer to gather more pertinent articles, which i used to find 2 papers
  62. @dirtybags31 That's great that you were able to tweak the terms to get relevant and useful results. So: it over-weighted Jill Man and missed a few terms?
  63. @dirtybags31 we'll keep working at it! If you're willing to send me your document we'll be happy to look more closely to figure out how to make it happen
  64. I really, really like this app! I just have one small question: why can't the results list be part of the main page instead of in a frame? It's harder to scroll through than it would be if the page just extended all the way to the bottom with results, particularly on large monitors.

    Thanks so much!

    --email from history professor in the UK, May 11 18:02
  65. Thanks, we’re glad you like it! The scroll frame with results can indeed be frustrating, so that’s a good suggestion, and as Text Analyzer grows and features are added, we’ll consider switching to that. It’s really helpful to hear from you.

    --reply to UK history professor from Jessica, May 11 18:09
  66. Just saw this cool video review of Text Analyzer. Some great points, @Libsaurus: we're working on many of them now!  https://www.youtube.com/watch?v=l4OiWpLEh38 
  67. Thanks for the feedback! We’re so glad to hear you like the app, and your suggestion about making the results panel more useful / larger makes a LOT of sense. We’re working actively right now to make the whole tool more useful, so feedback like yours helps us sort through and prioritize all of our ideas.

    Thanks again, and don’t hesitate to reach out to us if you think of other ways to make this more valuable to you.

    --reply to UK history professor from Alex, May 11 18:12
  68. @abhumphreys Great to hear! Looking forward to seeing the next stage!
  69. @dirtybags31 Dm me and I'll give you an email address to send it to?
  70. To Whom It May Concern
    Attached is the final paper I wrote for my undergraduate career. I used JSTOR text analyzer to find a couple of sources to anchor my paper to an ongoing discussion that is being had regarding debates found in the Pearl Manuscripts. When I "plugged" my paper into the text analyzer the results I got heavily weighted terms like "Jill Mann." Although this was by no means unhelpful, I had to insert terms like "Pearl poet" and "Pearl poem" myself in order to confine the search to find more pertinent results.

    --email from a recent graduate in New Jersey, May 11 20:18
  71. @Libsaurus Do you still have that page image u used? We improved the algorithm soon after you did this and I'd love to see if it made a difference.
  72. @Libsaurus That said, the tool in general works better the more text you throw at it.
  73. Dear [redacted], thanks for sharing this with us and for letting us know what terms you were especially interested in adding! Examples like this are exactly what we need to improve the Text Analyzer.

    --reply to recent NJ graduate from Alex, May 12 10:10
  74. @abhumphreys Sure do! I just got off a plane, so I'll have a bit of a play later on 😁
  75. Dear JSTOR. I'm a long-time fan, current undergraduate at Wesleyan University. Would love to get in touch with the people who are currently working on this technology - I'm looking to go into graduate school in Artificial Intelligence, with a concentration in Natural Language Processing, and this new tool fascinates me as a double humanities/stem major. It's admittedly got a way to go, but I would love to learn more about it.

    --email from undergraduate in Connecticut, May 18 15:00
  76. Thanks for the note and your interest! We’re thrilled that the Text Analyzer tool is intriguing you. What are you most interested in learning about? Given your interest, if you haven’t seen it yet, some of your questions might be answered by this blog post:  http://labs.jstor.org/blog/#!under_the_hood_of_text_analyzer . If you have more questions, let us know and we’ll help however we can!

    (Three cheers for combining humanities with STEM! Keep it up – the world needs people like you.)

    --reply to Connecticut undergrad from Alex, May 18 16:17
  77. Alex Humphreys of JSTOR Labs presents at June 15th NFAIS Webinar, Shifting Patterns in Search and Discovery. See  http://bit.ly/2q4zDKk  pic.twitter.com/jQTSKUc4of
  78. I used Mozilla Firefox for the search. Attached is the document for your perusal.

    --student in Ghana following up to our request for more info after he experienced a problem with Text Analyzer, May 23 12:14
  79. Thanks so much for sending this document to me! When I run it on Text Analyzer using either Firefox or Chrome, it works fine for me, which means that I don’t think the problem is with the document itself. I have two theories, but to test them it would help if you could try to use Text Analyzer on this document again.

    The first theory is that your internet connection was a bit slow and the upload timed out. Were you on a slow connection at the time? Could you try again? Did other, smaller documents work for you?

    The second theory is that there could be a problem with your version of Firefox. Do you have another browser that you could try it on?

    Thanks again for getting back to us and working with us to figure this out. I’m sure we can get this working for you.

    --reply to Ghana student from Alex, May 23 16:27
  80. Hi. Thanks for the reply. I accessed JSTOR using my county library credentials.

    I was testing Text Analyzer on a number of legal documents. I first tried the ACA case downloaded as an RTF file from Westlaw. That's a large, two-column document with a lot of hyperlinks. I'm not surprised it didn't work. I also tested an amicus brief from the Lee v. Tam case. That single column text document was about 55 pages and it had no hyperlinks. It worked great.

    I'm giving a CLE in June at the state bar association convention. It's a '60 tips in 60 minutes' style presentation. I want to spend a minute referencing Text Analyzer. It's a nice counterpoint to Casetext's CARA tool. Do you know it? See, casetext.com/cara/upload.

    --lawyer in Minnesota following up to our request for more info after he experienced a problem with Text Analyzer, May 24 14:32
  81. Thanks, this is incredibly helpful! Would you be willing to email me the ACA case (I promise to keep it private) so I can try it on this side? There’s no reason that a two-column file should fail, and if it does, that’s something we want to address.

    I’d not seen CARA! You’re right, it certainly looks like we’re both playing with the same set of ideas – perhaps it’s just an idea whose time has come. I’ll explore the tool further.

    Best of luck with your presentation. Do let me know if there’s any information about Text Analyzer that would be helpful as you create it.

    --reply to MN lawyer from Alex, May 24 17:47
  82. The ACA case is just the SCT Obamacare decision (132 S.Ct. 2566). No need to keep it private. See attached.

    CARA is simply performing a citation analysis (I think) whereas Text Analyzer is identifying key terms (I assume). Any description of how your tool is processing a text document would be very helpful.

    For what it's worth, processing a legal memorandum in Text Analyzer will appear a bit backwards to some. Lawyers are encouraged to start with secondary sources (treatises, law review articles, etc.), then move on to primary sources (statutes, case law, etc.) So, I'm thinking of presenting the material this way:

    Drafting your memo is an iterative process...

    Step 1: Start with very rough draft citing only well-known, key cases
    Step 2: Use Text Analyzer to find additional arguments in journals and other secondary sources
    Step 3: Draft brief using conventional research tools, cite to all relevant authority
    Step 4: Use CARA to make sure you're not missing anything...

    I welcome your insight...

    --another followup from the MN lawyer, May 24 17:47
  83. Right, the ACA case. That thing. (Sorry, I was context-switching and didn’t pause to think about the acronym...)

    Thanks for the doc itself – yep, that’s not working for me either. We’ll look into it. Thanks for identifying the problem and sharing it with us.

    You can read about how Text Analyzer processes a document here: jstor.org/analyze/about and, for a more technical and specific description, here:  http://labs.jstor.org/blog/#!under_the_hood_of_text_analyzer .

    Your process and description sound entirely reasonable to these non-lawyerly ears. I hope it’s well-received and of course, if you or your audience identify features that would make Text Analyzer more valuable to lawyers writing memos, I hope you’ll let me know.

    --reply to MN lawyer from Alex, May 24 17:51
  84. Hi! I was wondering if you could allow the possibility of moving tags from different categories (for example, the name "Santiago Gamboa" is recognised as a location). Thanks!

    --email from history professor in Chile, May 25 9:29
  85. Thanks for the feedback, Maria.

    It’s a great suggestion. In the current version of the Text Analyzer the specific entity category isn’t that important, as all terms are searched in all fields in our index. In future versions we may well do more targeted indexing and searching, and in that case having the correct category will be important. We’re looking at improvements in our entity recognition processing to better categorize the entities found. The entity recognition engines we use currently do a reasonably good job of inferring entity type based on text usage, but the misinterpretation of location and person names is still one of the more common problems we see in entity recognition. It’s tricky. As another example, my hometown of Ann Arbor, Michigan is often interpreted as a person rather than a location. We’ll probably use gazetteers for improved location disambiguation in an upcoming version of Text Analyzer, which should help with terms like “Santiago Gamboa” and “Ann Arbor”.

    Thanks for giving Text Analyzer a spin. How did it work for you otherwise? The tool is in active development and feedback from users is really helpful as we work to improve it.

    --reply to Chile professor from Ron, May 25 10:37
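
    Note on item 85: the gazetteer idea Ron mentions can be illustrated with a small sketch. This is not the Text Analyzer implementation. The gazetteer contents, the Entity type, the label names, and the "UNCERTAIN" fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical toy gazetteer; a real one would be a large list of place names.
GAZETTEER = frozenset({"ann arbor", "santiago", "nottingham"})


@dataclass(frozen=True)
class Entity:
    text: str
    label: str  # e.g. "PERSON" or "LOCATION", as guessed by an upstream recognizer


def relabel_with_gazetteer(entity: Entity, gazetteer: frozenset = GAZETTEER) -> Entity:
    """Adjust an entity's label using an exact, case-insensitive gazetteer lookup.

    - Full text found in the gazetteer: label becomes "LOCATION".
    - Labeled "LOCATION" but not found: label becomes "UNCERTAIN"
      (an assumed fallback, so the term can be reviewed rather than trusted).
    - Anything else is returned unchanged.
    """
    key = " ".join(entity.text.lower().split())
    if key in gazetteer:
        return replace(entity, label="LOCATION")
    if entity.label == "LOCATION":
        return replace(entity, label="UNCERTAIN")
    return entity


if __name__ == "__main__":
    samples = [
        Entity("Ann Arbor", "PERSON"),          # place misread as a person
        Entity("Santiago Gamboa", "LOCATION"),  # person misread as a place
        Entity("Jill Mann", "PERSON"),
    ]
    for e in samples:
        print(e.text, "->", relabel_with_gazetteer(e).label)
```

    Matching on the full entity text, rather than on any single word, is what keeps "Santiago Gamboa" from being confirmed as a place just because "Santiago" is one.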
  86. Thank you very much for your prompt reply.
    Otherwise, I love Text Analyzer and will definitely be a frequent user and promoter!

    --another email from history professor in Chile, May 25 11:06
  87. Actually it is a large word document of about 14000 words with image and tables

    --geomorphology Ph.D. student in the UK, following up to our request for more info after experiencing a problem with Text Analyzer, May 25 13:57
  88. That *is* a large file. That said, I’ve run Text Analyzer on 200,000 word books, so it *should* work (and images and tables shouldn’t be a problem). What kind of connection to the internet are you on? If it’s a slow connection, it’s possible that the file is just taking too long to upload. Have you been able to run Text Analyzer successfully on other, smaller files?

    Thanks – hopefully we can get this working for you.

    --reply to UK geomorphology Ph.D. student from Alex, May 25 14:31
  89. Internet at the University of Nottingham

    --another email from geomorphology Ph.D. student in the UK, May 25 14:42
  90. Thanks. Have you been able to use Text Analyzer on other, smaller documents? Would you be willing to email the file to me to try here?

    --reply to UK geomorphology Ph.D. student from Alex, May 25 14:46
  91. Hello,
    I tried with several papers. Small ones worked just fine, but this, not being too big (3.87 MB) gave the trouble.
    The tool is great! I hope this interaction helps in improving it.
    Greetings, and thanks a lot for making the app available and your speedy answer.

    --director of botanical garden in Spain, following up to our request for more info after experiencing a problem with Text Analyzer, May 30 7:56
  92. Thanks so much and I’m glad to hear that it worked fine for other documents. Depending on your connection speed, uploading a file that large could have caused a problem. Have you tried it again? Would you be willing to share the document with us so that we can see if we can run it on this side?

    --reply to Spanish botanical garden director from Alex, May 30 9:03
  93. Attached is the paper giving me problems; I already sent it in my previous message. Didn't you get it?

    My connection is 1 Gbps: [screenshot]

    --another email from director of botanical garden in Spain, May 30 10:26
  94. Thanks, [redacted], I received the paper (my apologies, I missed it in the previous message).

    The good news is that this document doesn’t work for me, either, which means that we can reproduce the problem and can start trying to fix it. Thanks again, and we’ll let you know if and when we are able to resolve the problem.

    --reply to Spanish botanical garden director from Alex, May 30 11:05
  95. Have you tried Text Analyzer yet? Upload your doc and get research suggestions based on the text! pic.twitter.com/bb9Hko2YBa
  96. Have you tried JSTOR's next text analyzer yet? 1/3
  97. You download the text that you are writing and JSTOR will look for relevant documents to support your writing! 2/3
  98. @SquiresLibrary 4/3 And then let us know how it worked for you!! We here at @JSTOR Labs are working hard to keep improving this tool!
  99. I’m happy to report that we believe we have fixed the issue that was keeping the file you sent us from being processed in Text Analyzer. Your sending us the document was exactly what we needed to pinpoint the problem, and we’re hoping that this fix helps not just you but other users affected by the same problem. Thanks *so* much for your help in fixing this!

    Do let us know if you’re still having any problems, and, once again, thanks for helping us to make Text Analyzer better.

    --reply to Spanish botanical garden director from Alex, May 31 9:53
  100. It worked! Great!
    Thanks a lot. The tool is fantastic. We will carry on working with it.

    --another email from director of botanical garden in Spain, June 1 5:53