Clinical registries offer crucial benefits to the surgical sciences. Because randomized controlled trials are relatively rare in surgery, registries are critical to supporting research, device monitoring, and trial development. Currently, they are costly to create and challenging to maintain, requiring the skilled labor of physicians or nurses to enter data manually.

In a study published in Neurosurgery, NYU Langone Health researchers used neurosurgeons’ operative notes to inform an accurate and interpretable natural language processing (NLP) algorithm that could generate an automated registry of spine surgery.

NLP combs through data to pinpoint keywords that capture the meaning of a body of text, and so has the potential to reduce the time and cost that make scaling up manual registries challenging. This creates the potential to facilitate more clinical research opportunities and to better monitor care.

“Previously surgeons had to manually enter registry data, or rely upon programming methods set up by computer scientists,” says Eric K. Oermann, MD, senior author on the study. “Now, surgeons can set up the computational model instead of manually doing the work themselves.”

Human in the Loop

This model takes a human-in-the-loop approach, in which a computational or artificial intelligence (AI) model is improved through direct human input.

According to Dr. Oermann, surgeons actively participating in training the model were asked questions like “How do you talk about surgery?” and “How do you write about surgery?” They then used a pattern-matching method within NLP to design classifiers that could order or categorize this information on a surgeon-by-surgeon basis.

Researchers began with the NYU Langone “data lake,” which combines unstructured data from the electronic health record (EHR), billing department, imaging systems, and other sources of information. They ran a structured language query to collect 31,502 notes—including spine, cranial, and peripheral cases—for analysis. This data was processed using regular expression (regex) classifiers comprising 650 lines of code, written in collaboration with operating surgeons and trainees.
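The study’s 650 lines of classifiers are not reproduced here, but the general idea can be sketched with a minimal example: regex patterns, designed around how surgeons actually phrase their notes, label each note with matched procedures and vertebral levels. The pattern names, patterns, and example note below are illustrative assumptions, not the study’s code.

```python
import re

# Hypothetical patterns for a few spine procedures; the study's real
# classifiers were written jointly with operating surgeons and trainees.
PROCEDURE_PATTERNS = {
    "laminectomy": re.compile(r"\blaminectom(?:y|ies)\b", re.IGNORECASE),
    "discectomy": re.compile(r"\bdiscectom(?:y|ies)\b", re.IGNORECASE),
    "fusion": re.compile(r"\b(?:fusion|arthrodesis)\b", re.IGNORECASE),
}

# Vertebral levels such as "C5", "L4-L5", or "L4-5"
LEVEL_PATTERN = re.compile(
    r"\b([CTLS])(\d{1,2})(?:\s*-\s*\1?(\d{1,2}))?\b", re.IGNORECASE
)

def classify_note(note: str) -> dict:
    """Label one operative note with matched procedures and vertebral levels."""
    procedures = [name for name, pat in PROCEDURE_PATTERNS.items()
                  if pat.search(note)]
    levels = [m.group(0).upper().replace(" ", "")
              for m in LEVEL_PATTERN.finditer(note)]
    return {"procedures": procedures, "levels": levels}

note = "Patient underwent L4-L5 laminectomy and posterolateral fusion."
print(classify_note(note))
# → {'procedures': ['laminectomy', 'fusion'], 'levels': ['L4-L5']}
```

Because each pattern is a plain, readable rule, a surgeon can inspect exactly why a note received a label, which is the interpretability advantage the study emphasizes.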

In this study, the relative simplicity of the regex method provided transparency in interpreting results compared with more sophisticated AI models, which are often described as black boxes: it can be difficult to understand how they arrive at their decisions, making them challenging to deploy in healthcare settings.

Running the operative notes through the regex classifiers generated an autoregistry covering 14 labeled spine procedures. The NLP classifiers identified spinal procedures and relevant vertebral levels with an average accuracy of 98.86 percent, and correctly identified the entire list of defined surgical procedures in 89 percent of patients. Researchers were also able to identify patients who required additional operations within 30 days, supporting monitoring of outcomes and quality metrics.
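Once notes are structured into a registry, quality metrics such as 30-day reoperation become simple queries. The sketch below, with hypothetical patient IDs and dates, shows one way to flag patients with a second operation within 30 days of a prior one; it illustrates the kind of query the study describes, not its actual analysis code.

```python
from datetime import date

# Hypothetical registry rows: (patient_id, surgery_date)
registry = [
    ("pt1", date(2021, 3, 1)),
    ("pt1", date(2021, 3, 20)),   # reoperation 19 days later
    ("pt2", date(2021, 4, 5)),
]

def reoperations_within(registry, days=30):
    """Return patient IDs with any operation within `days` of a prior one."""
    by_patient = {}
    for pid, d in registry:
        by_patient.setdefault(pid, []).append(d)
    flagged = set()
    for pid, dates in by_patient.items():
        dates.sort()
        for earlier, later in zip(dates, dates[1:]):
            if (later - earlier).days <= days:
                flagged.add(pid)
    return flagged

print(reoperations_within(registry))
# → {'pt1'}
```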

Next Steps

Because the classifiers are tailored to the specific language of individual surgeons, the results cannot be generalized directly to other hospitals. However, the study demonstrates that the method could be replicated in other settings to generate additional autoregistries, requiring only neurosurgery domain expertise and practical NLP programming skills. With relatively simple computing requirements and cost-effective use of surgeons’ time, this method improves upon existing manual approaches. Future directions for research include collecting additional spine outcomes data and comparing the regex classifiers against other machine learning models.