Résumé parsing explained

Resume parsing, also known as CV parsing, resume extraction, or CV extraction, allows for the automated storage and analysis of resume data. The resume is imported into parsing software and the information is extracted so that it can be sorted and searched.

Principle

Resume parsers analyze a resume, extract the desired information, and insert the information into a database with a unique entry for each candidate.[1] Once the resume has been analyzed, a recruiter can search the database for keywords and phrases and get a list of relevant candidates. Many parsers support semantic search, which adds context to the search terms and tries to understand intent in order to make the results more reliable and comprehensive.[2]
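The store-then-search flow described above can be sketched in a few lines. This is only an illustration, not how any particular vendor works: the table layout, the field names (name, email, skills), and the sample records are assumptions, and plain substring matching stands in for the keyword search (semantic search is not shown).

```python
import sqlite3

def build_database(parsed_resumes):
    """Store parsed resumes in an in-memory SQLite database, one row per candidate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE candidates ("
        "id INTEGER PRIMARY KEY, name TEXT, email TEXT UNIQUE, skills TEXT)"
    )
    conn.executemany(
        "INSERT INTO candidates (name, email, skills) VALUES (?, ?, ?)",
        [(r["name"], r["email"], ", ".join(r["skills"])) for r in parsed_resumes],
    )
    conn.commit()
    return conn

def search_candidates(conn, keyword):
    """Return names of candidates whose skills contain the keyword.

    SQLite's LIKE is case-insensitive for ASCII text by default.
    """
    rows = conn.execute(
        "SELECT name FROM candidates WHERE skills LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    return [name for (name,) in rows]

# Hypothetical parser output; a real parser would extract these fields from documents.
parsed = [
    {"name": "Ada Example", "email": "ada@example.com", "skills": ["Python", "SQL"]},
    {"name": "Bob Sample", "email": "bob@example.com", "skills": ["Java", "Kotlin"]},
]
conn = build_database(parsed)
print(search_candidates(conn, "python"))
```

The UNIQUE constraint on the email column is one simple way to keep a single entry per candidate; real systems may identify candidates differently.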

Machine learning

Machine learning is central to resume parsing. Each block of information must be labeled and sorted into the correct category, such as education, work history, or contact information.[3] Rule-based parsers apply a predefined set of rules to the text. On its own, this method works poorly for resumes because the parser needs to "understand the context in which words occur and the relationship between them."[4] For example, if the word "Harvey" appears on a resume, it could be the name of an applicant, refer to the college Harvey Mudd, or reference the company Harvey & Company LLC. The abbreviation MD could mean "Medical Doctor" or "Maryland". A rule-based parser would need extremely complex rules to account for all of this ambiguity and would still provide limited coverage.
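The ambiguity problem can be made concrete with a toy example. The dictionary, the label names, and the single hand-written context rule below are all hypothetical; the sketch only shows that a context-free lookup returns several candidate labels and that every new ambiguity needs another rule.

```python
# Hypothetical gazetteer: each surface form maps to every label it could carry.
GAZETTEER = {
    "harvey": {"PERSON_NAME", "COLLEGE", "COMPANY"},
    "md": {"DEGREE", "US_STATE"},
    "python": {"SKILL"},
}

def candidate_labels(token):
    """Return all labels a context-free dictionary rule would allow for a token."""
    return GAZETTEER.get(token.lower().strip(".,"), set())

def label_with_context(token, next_token):
    """Toy disambiguation: use the following word to choose among candidate labels."""
    labels = candidate_labels(token)
    if len(labels) <= 1:
        # Unambiguous (or unknown) token: return its only label, or None.
        return next(iter(labels), None)
    nxt = next_token.lower()
    if token.lower() == "harvey":
        if nxt == "mudd":
            return "COLLEGE"
        if nxt in {"&", "and"}:
            return "COMPANY"
        return "PERSON_NAME"
    return None  # still ambiguous: no hand-written rule covers this token

print(candidate_labels("Harvey"))
print(label_with_context("Harvey", "Mudd"))
print(label_with_context("MD", "20850"))
```

Here "MD" stays unresolved because nobody wrote a rule for it, which is the coverage problem the paragraph describes; a learned model would instead infer the label from the surrounding text.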

Natural language processing (NLP) is a branch of artificial intelligence that uses machine learning to make predictions and to understand content and context.[5] Resume parsers draw on several NLP techniques. Acronym normalization and tagging account for the different formats in which an acronym can be written and map them to a single form. Lemmatization reduces words to their root form using a language dictionary, while stemming strips suffixes such as “s” and “ing”. Entity extraction uses regular expressions, dictionaries, statistical analysis and complex pattern-based extraction to identify people, places, companies, phone numbers, email addresses, important phrases and more.
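A minimal sketch of three of these steps follows, using only the Python standard library. The regular expressions, the acronym table, and the suffix list are simplified assumptions chosen for illustration (the phone pattern only covers common US-style numbers); production parsers use far richer dictionaries and statistical models.

```python
import re

# Simplified patterns; real-world email and phone formats vary much more widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

# Hypothetical acronym table mapping written variants to one canonical form.
ACRONYMS = {"b.s.": "BS", "bs": "BS", "b.sc.": "BS", "m.b.a.": "MBA", "mba": "MBA"}

def normalize_acronym(token):
    """Map a known acronym variant to its canonical form; leave other tokens alone."""
    return ACRONYMS.get(token.lower(), token)

def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of the word."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def extract_contacts(text):
    """Pull email addresses and phone numbers out of free text with regexes."""
    return {"emails": EMAIL_RE.findall(text), "phones": PHONE_RE.findall(text)}

sample = "Contact: jane.doe@example.com, (555) 123-4567"
print(extract_contacts(sample))
print(normalize_acronym("B.S."), naive_stem("managing"), naive_stem("projects"))
```

Suffix stripping of this kind is crude and can produce non-words, which is one reason dictionary-based lemmatization is used alongside it.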

Effectiveness

Resume parsers have achieved up to 87% accuracy,[6] where accuracy refers to entering the data and categorizing it correctly. Human accuracy is typically no greater than 96%, so resume parsers are described as having achieved "near human accuracy."[7]
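The cited sources do not spell out how these percentages were measured. One plausible reading, shown here purely as an assumption, is field-level accuracy: the share of fields in a hand-checked reference record that the parser filled with the same value. The field names and values are made up for the example.

```python
def field_accuracy(extracted, reference):
    """Fraction of reference fields whose extracted value matches exactly."""
    if not reference:
        raise ValueError("reference must contain at least one field")
    correct = sum(1 for key, value in reference.items() if extracted.get(key) == value)
    return correct / len(reference)

# Hypothetical hand-checked record and parser output for one resume.
reference = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "degree": "BS",
    "employer": "Acme Corp",
}
extracted = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "degree": "BS",
    "employer": "Acme",  # truncated by the parser, so counted as wrong
}
print(field_accuracy(extracted, reference))
```

Exact matching is strict; an evaluation could also give partial credit for near matches, which would change the reported figure.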

One executive recruiting company tested three resume parsers and humans to compare the accuracy in data entry. They ran 1000 resumes through the resume parsing software and had humans manually parse and enter the data. The company brought in a third party to evaluate how the humans did compared to the software. They found that the results from the resume parsers were more comprehensive and had fewer mistakes. The humans did not enter all the information on the resumes and occasionally misspelled words or wrote incorrect numbers.[8]

In a 2012 experiment, a resume for an ideal candidate was created based on the job description for a clinical scientist position. After the resume went through the parser, one of the candidate's work experiences was lost entirely because the date was listed before the employer. The parser also missed several educational degrees. As a result, the candidate received a relevance ranking of only 43%. Had this been a real candidate's resume, the candidate would not have advanced to the next step despite being qualified for the position.[9] A similar study of current resume parsers would show whether they have improved since then.

Benefits

Challenges

The parsing software has to rely on complex rules and statistical algorithms to correctly capture the desired information in resumes. Writing style, word choice, and syntax vary widely, and the same word can have multiple meanings. A date alone can be written in hundreds of different ways. Accounting for all of this ambiguity remains a challenge for resume parsers. Natural language processing and artificial intelligence still have a way to go in understanding context-based information and what humans mean to convey in written language.
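The date problem is easy to demonstrate. The sketch below tries a short, assumed list of formats in order and gives up on anything else; the format list is illustrative, month names are matched in the default (English) locale, and numeric dates are read in US month-first order, which is itself a guess a real parser would have to justify.

```python
from datetime import datetime

# Assumed formats, tried in order; real resumes use many more variants.
DATE_FORMATS = (
    "%Y-%m-%d",   # 2012-03-01
    "%m/%d/%Y",   # 3/1/2012 (month first is an assumption)
    "%B %d, %Y",  # March 1, 2012
    "%b %d, %Y",  # Mar 1, 2012
    "%B %Y",      # March 2012
    "%b %Y",      # Mar 2012
    "%m/%Y",      # 03/2012
    "%Y",         # 2012
)

def parse_date(text):
    """Return a date for the first matching format, or None if none match.

    Missing components default to 1 (e.g. "March 2012" becomes 2012-03-01).
    """
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None

for raw in ("2012-03-01", "March 2012", "Mar 2012", "03/2012", "Spring 2012"):
    print(raw, "->", parse_date(raw))
```

Inputs such as "Spring 2012" or "2010 to present" fall through to None, so each additional style needs either another format or a different technique.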

Resume optimization

Resume parsers have become so omnipresent that it is now recommended that candidates focus on writing to the parsing system rather than to the recruiter. The following techniques have been proposed to increase the probability of success:

  1. Use keywords from the job description in relevant places on your resume.
  2. Don't use headers or footers, since they may confuse the parsing algorithms.[15]
  3. Use a simple style for fonts, layouts and formatting.
  4. Avoid graphics.
  5. Use standard section names such as “Work Experience” and “Education”.
  6. Avoid using acronyms unless they're included in the job description.
  7. Don't start with dates in the "Work Experience" section.
  8. Stay consistent with formatting past work experience.
  9. Send the resume in DOCX, DOC, or PDF format.
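The first technique in the list, matching keywords from the job description, can be checked mechanically. The following is a rough sketch under stated assumptions: the tokenizer, the function names, and the sample texts are invented for illustration, only single-word keywords are handled, and actual parsers and ranking systems score resumes in ways that are not public.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word-like tokens (keeps + and #)."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))

def keyword_coverage(resume_text, keywords):
    """Return (fraction of keywords found in the resume, sorted missing keywords)."""
    wanted = {keyword.lower() for keyword in keywords}
    if not wanted:
        raise ValueError("keywords must not be empty")
    missing = sorted(wanted - tokenize(resume_text))
    return (len(wanted) - len(missing)) / len(wanted), missing

# Hypothetical resume excerpt and job-description keywords.
resume = "Work Experience: built Python and SQL pipelines"
job_keywords = ["Python", "SQL", "Kubernetes", "Docker"]
coverage, missing = keyword_coverage(resume, job_keywords)
print(coverage, missing)
```

A candidate could use the list of missing keywords to decide which relevant terms to add, provided they genuinely apply to their experience.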

Software and vendors

There are many stand-alone resume parsers, including[16] RChilli, Skillate, CandidateZip, Sovren, Daxtra, Textkernel, and Hireability. Resume parsers are also typically bundled with applicant tracking systems, which companies use to streamline the hiring process.[17]

Recent advances in machine learning, text mining, and analysis, which are claimed to reach up to 95% accuracy in data processing, have given rise to many AI technologies[18] that help job seekers create application documents. These services focus on creating ATS-friendly resumes, checking and screening resumes, and helping with the rest of the preparation and application process. Some AI builders, such as Leap.ai and Skillroads, concentrate on resume creation, while others, like Stella, also help with the job hunt itself by matching candidates to appropriate vacancies. In 2017, Google launched Google for Jobs. This extension of the search engine uses Cloud Talent Solution,[19] Google's own iteration of the AI resume builder and matching system.

Future

Resume parsers are already standard in most mid- to large-sized companies and this trend will continue as the parsers become even more affordable.

A qualified candidate's resume can be ignored if it is not formatted properly or does not contain specific keywords or phrases. As machine learning and natural language processing improve, so will the accuracy of resume parsers.

Resume parsing software is also expanding from pure extraction into contextual analysis of the information in the resume. One employee at a parsing company said “a parser needs to classify data, enrich it with knowledge from other sources, normalize data so it can be used for analysis and allow for better searching.”[20]

Parsing companies are also being asked to expand beyond just resumes or even LinkedIn profiles. They are working on extracting information from industry-specific sites such as GitHub and social media profiles.

Notes and References

  1. “What Is CV/Resume Parsing?” Daxtra, Daxtra Technologies Ltd, 18 Oct. 2016, https://info.daxtra.com/blog/2016/10/18/what-is-cvresume-parsing.
  2. Ratcliff, Christopher. “What Is Semantic Search and Why Does It Matter?” Search Engine Watch, ClickZ Group Limited, 21 Oct. 2015, searchenginewatch.com/sew/opinion/2431292/what-is-semantic-search-and-why-does-it-matter.
  3. “Is Your Resume Ready for Automated Screening?” Resume Hacking, Resume Hacking, 2 Jan. 2016, www.resumehacking.com/ready-for-automated-resume-screening.
  4. Nelson, Paul. "Natural Language Processing (NLP) Techniques for Extracting Information." Search Technologies, Search Technologies, www.searchtechnologies.com/blog/natural-language-processing-techniques.
  5. Reynolds, Brandon. “The Terrible Trouble with Natural Language Processing (It's Us.).” Salesforce Blog, Salesforce.com, Inc., 17 Aug. 2016, www.salesforce.com/blog/2016/08/trouble-with-natural-language-processing.html.
  6. “HR software companies? Why structuring your data is crucial for your business?” 15 Apr. 2019.
  7. “The Ultimate Guide to CV/Resume Parsing.” Daxtra, Daxtra Technologies Ltd, 30 Jun. 2022, https://info.daxtra.com/the-ultimate-guide-to-cv-resume-parsing.
  8. "A Top Executive Recruiter Puts Accuracy to the Ultimate Test." Resume Parsing: Putting Accuracy to the Ultimate Test, Sovren Group, Inc., www.sovren.com/resource-center/a-top-executive-recruiter-puts-accuracy-to-the-ultimate-test/.
  9. Levinson, Meridith. “5 Insider Secrets for Beating Applicant Tracking Systems (ATS).” CIO, CIO, 1 Mar. 2012, www.cio.com/article/2398753/careers-staffing/careers-staffing-5-insider-secrets-for-beating-applicant-tracking-systems.html.
  10. Bertrand, Marianne, and Sendhil Mullainathan. “Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” National Bureau of Economic Research, Working Paper 9873, July 2003, doi:10.3386/w9873.
  11. “3 Ways Recruiters Can Use AI to Reduce Unconscious Bias.” Undercover Recruiter, 12 May 2017, theundercoverrecruiter.com/ai-reduce-unconscious-bias/.
  12. “Baby Steps in HR Technology: What Is Resume Parsing?” Recruiterbox, Recruiterbox Inc, 12 Oct. 2017, recruiterbox.com/blog/baby-steps-in-hr-technology-what-is-resume-parsing-2/.
  13. Cain, Áine. “The Real Reason 60% of Job Seekers Can't Stand the Application Process.” Business Insider, Business Insider, 16 June 2016, www.businessinsider.com/why-most-ob-seekers-cant-stand-the-application-process-2016-6.
  14. Schultz, Carol. “Got a Minute? If So, Spend It Looking at Resumes.” ERE, ERE Media., 3 May 2012, www.ere.net/got-a-minute-if-so-spend-it-looking-at-resumes/.
  15. Cappelli, Peter. “How to Get a Job? Beat the Machines.” Time, Time Inc., 11 June 2012, business.time.com/2012/06/11/how-to-get-a-job-beat-the-machines/.
  16. “What is the best resume parsing software?”
  17. Hu, James. “Your Top 7 Questions About Applicant Tracking Systems, Answered.” Recruiter, Recruiter.com, Inc., 16 Aug. 2017, www.recruiter.com/i/your-top-7-questions-about-applicant-tracking-systems-answered/.
  18. “AI technologies that help you to get hired.” Skillroads.
  19. “Cloud Talent Solution.” Google.
  20. Zielinski, Dave. “Does Your Resume Parser Stack Up? How to Evaluate Next-Generation Systems.” SHRM Society for Human Resource Management, SHRM, 10 May 2016, www.shrm.org/resourcesandtools/hr-topics/technology/pages/does-your-resume-parser-stack-up-how-to-evaluate-next-generation-systems.aspx?sthash.2dz2wgkl.mjjo.