Text Nailing (TN) is an information extraction method for semi-automatically extracting structured information from unstructured documents. The method allows a human to interactively review small blobs of text drawn from a large collection of documents in order to identify potentially informative expressions. The identified expressions can then be used to enhance computational methods that rely on text (e.g., regular expressions) as well as advanced natural language processing (NLP) techniques. TN combines two concepts: 1) human interaction with narrative text to identify highly prevalent non-negated expressions, and 2) conversion of all expressions and notes into non-negated, alphabetical-only representations to create homogeneous representations. [1] [2]
In traditional machine learning approaches for text classification, a human expert is required to label phrases or entire notes, and then a supervised learning algorithm attempts to generalize the associations and apply them to new data. In contrast, using non-negated distinct expressions eliminates the need for an additional computational method to achieve generalizability.[3] [4] [5]
TN was developed at Massachusetts General Hospital and was tested in multiple scenarios, including extracting smoking status and family history of coronary artery disease, identifying patients with sleep disorders,[6] improving the accuracy of the Framingham risk score for patients with non-alcoholic fatty liver disease, and classifying non-adherence in type-2 diabetes. A comprehensive review regarding extracting information from textual documents in the electronic health record is available.[7] [8]
The importance of using non-negated expressions to achieve an increased accuracy of text-based classifiers was emphasized in a letter published in Communications of the ACM in October 2018.[9]
Sample code for extracting smoking status from narrative notes using "nailed expressions" is available on GitHub.[10]
In July 2018, researchers from Virginia Tech and the University of Illinois at Urbana–Champaign referred to TN as an example of progressive cyber-human intelligence (PCHI).[11]
Chen & Asch 2017 wrote "With machine learning situated at the peak of inflated expectations, we can soften a subsequent crash into a “trough of disillusionment” by fostering a stronger appreciation of the technology’s capabilities and limitations."[12]
A letter published in Communications of the ACM, "Beyond brute force", emphasized that a brute force approach may perform better than traditional machine learning algorithms when applied to text. The letter stated "... machine learning algorithms, when applied to text, rely on the assumption that any language includes an infinite number of possible expressions. In contrast, across a variety of medical conditions, we observed that clinicians tend to use the same expressions to describe patients' conditions."[13]
In his viewpoint published in June 2018 concerning the slow adoption of data-driven findings in medicine, Uri Kartoun, co-creator of Text Nailing, states that "... Text Nailing raised skepticism in reviewers of medical informatics journals who claimed that it relies on simple tricks to simplify the text, and leans heavily on human annotation. TN indeed may seem just like a trick of the light at first glance, but it is actually a fairly sophisticated method that finally caught the attention of more adventurous reviewers and editors who ultimately accepted it for publication."[14]
The human-in-the-loop process is a way to generate features using domain experts. Using domain experts to come up with features is not a novel concept. However, the specific interfaces and methods that help the domain experts create the features are most likely novel.
In this case, the features the experts create are equivalent to regular expressions. Removing non-alphabetical characters and then matching on "smokesppd" is equivalent to matching the regular expression /smokes[^a-zA-Z]*ppd/ against the original text. Using regular expressions as features for text classification is not novel.
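The equivalence described above can be illustrated with a minimal Python sketch. It is not the authors' published code: the helper names `to_alpha_only` and `matches_nailed`, the lowercasing step, the case-insensitive flag, and the example notes are all assumptions made for illustration.

```python
import re


def to_alpha_only(text: str) -> str:
    """Lowercase the text and drop every character that is not an ASCII letter.

    Lowercasing is an assumption of this sketch, not something the source states.
    """
    return re.sub(r"[^a-z]", "", text.lower())


def matches_nailed(note: str, expression: str) -> bool:
    """Check whether the alphabetical-only form of `expression` occurs in the note."""
    return to_alpha_only(expression) in to_alpha_only(note)


# The regular expression from the text; IGNORECASE is added here to mirror
# the lowercasing assumed in to_alpha_only.
PATTERN = re.compile(r"smokes[^a-zA-Z]*ppd", re.IGNORECASE)

# Hypothetical notes, invented for illustration only.
notes = [
    "Patient smokes 2 ppd.",
    "smokes: 1.5 PPD for 20 years",
    "Denies smoking.",
]

for note in notes:
    by_normalization = matches_nailed(note, "smokes ppd")
    by_regex = bool(PATTERN.search(note))
    print(f"{note!r}: normalization={by_normalization}, regex={by_regex}")
```

On these three notes both approaches agree: the first two match and the third does not.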
Given these features, the classifier is a threshold set manually by the authors, chosen according to performance on a set of documents. This is still a classifier; its parameter, in this case a threshold, is simply set manually. Given the same features and documents, almost any machine learning algorithm should be able to find the same threshold or (more likely) a better one.
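As a rough illustration of this point, the sketch below treats the classifier as a single threshold on a per-document score and lets a simple search (a one-feature decision stump) pick the threshold from labeled data. The score definition (for example, a count of matched expressions), the function names, and the toy data are hypothetical and are not taken from the TN publications.

```python
from typing import Sequence


def predict(score: float, threshold: float) -> bool:
    """Classify a document as positive when its score reaches the threshold."""
    return score >= threshold


def accuracy(scores: Sequence[float], labels: Sequence[bool], threshold: float) -> float:
    """Fraction of documents whose prediction agrees with the label."""
    correct = sum(predict(s, threshold) == y for s, y in zip(scores, labels))
    return correct / len(labels)


def fit_threshold(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Return the observed score with the highest training accuracy.

    Candidates are tried in ascending order, so the lowest one wins ties.
    """
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


# Hypothetical per-document scores and expert labels, invented for illustration.
scores = [0, 0, 1, 1, 2, 3, 4, 5]
labels = [False, False, False, False, True, True, True, True]

manual_threshold = 3  # a hand-picked value, standing in for a manual choice
learned_threshold = fit_threshold(scores, labels)

print("manual:", manual_threshold, accuracy(scores, labels, manual_threshold))
print("learned:", learned_threshold, accuracy(scores, labels, learned_threshold))
```

On this toy data the search selects a threshold of 2, which classifies all eight documents correctly, while the hand-picked threshold of 3 misclassifies one. Any comparison of practical value would evaluate both thresholds on held-out documents rather than on the training set.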
The authors note that using support vector machines (SVM) and hundreds of documents gives inferior performance, but they do not specify which features or documents the SVM was trained and tested on. A fair comparison would use the same features and document sets as those used by the manual threshold classifier.