
SCENT Study: Identifying primary and recurrent cancer diagnoses with a SAS computer program


Accurate identification of primary and recurrent cancer diagnoses is critically important to clinical researchers, but traditional identification methods and electronic diagnosis codes have significant limitations. To overcome these limitations and further the science of clinical cancer research, Kaiser Permanente Southern California researchers have developed a SAS-based coding, extraction, and nomenclature tool (SCENT). SCENT uses natural language processing to identify and extract information from the text of electronic pathology reports. Because SAS statistical software is widely used in clinical research settings, SCENT should be highly accessible.

To assess the accuracy of SCENT, researchers conducted a validation study using pathology reports of randomly selected breast and prostate cancer patients. The tool successfully identified 97 percent (111/115) of confirmed cancer diagnoses and produced few false positives (3/792). Additional information about SCENT is available in a peer-reviewed publication in the Journal of the American Medical Informatics Association.
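The validation figures above can be re-derived with a few lines of arithmetic. The following is a minimal Python sketch for illustration only, not part of SCENT itself. The variable names are hypothetical, and the 792 denominator is used exactly as reported, since the summary does not state what it counts.

```python
# Counts taken from the validation summary above.
confirmed_total = 115   # confirmed cancer diagnoses
confirmed_found = 111   # confirmed diagnoses the tool identified
false_positives = 3     # false positives reported
fp_denominator = 792    # denominator as reported; what it counts is not stated here


def percent(numerator, denominator):
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


detection_pct = percent(confirmed_found, confirmed_total)
false_positive_pct = percent(false_positives, fp_denominator)
missed = confirmed_total - confirmed_found

print(f"Confirmed diagnoses identified: {detection_pct:.1f}%")
print(f"Confirmed diagnoses missed: {missed}")
print(f"False positives: {false_positive_pct:.2f}%")
```

Under these counts, the detection figure rounds to the 97 percent quoted above, with 4 confirmed diagnoses missed.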




Virginia P. Quinn, PhD, MPH, Principal Investigator - Visit Virginia Quinn's Scientist page to learn more about her work.

Chun Chao, PhD, MS, Co-Investigator - Visit Chun Chao's Scientist page to learn more about her work.

Justin Strauss, MA, Research Associate III - Justin Strauss provides support to ongoing epidemiologic studies and data infrastructure initiatives in the Department of Research & Evaluation at Kaiser Permanente Southern California. His background is in demographic analysis, and he has experience working with a variety of computer programming languages. Justin worked with research scientists and SCPMG clinical leaders to conceptualize, develop, and validate the SCENT program.


SCENT Program

The SCENT program will be available soon.