Natural language processing
Natural Language Processing (NLP) is concerned with the exploration of computational techniques to learn, understand and produce human language content. NLP technologies can assist both human-human communication (e.g. machine translation) and human-machine communication (e.g. conversational interfaces and automated personal assistants), and can analyse and learn from the vast amount of textual data available online.
NLP is important to the development of intelligent interfaces, to explainable Artificial Intelligence (AI), and to data science. This strategy notes the opportunities for increased activity and, in parallel with foci on data science and intelligent interfaces, aims to maintain capability in mainstream statistical NLP within UK academia.
By the end of the current Delivery Plan, we aim to have:
- A research and training portfolio that contributes to development of new intelligent interfaces with NLP at their core. NLP will increasingly serve as an interface for communicating between humans and systems (e.g. in the Internet of Things) and dialogue management will become increasingly important, linking NLP with the related fields of Speech Technologies and human-robot interaction. Researchers should also be encouraged to address challenges in multi-modal interfaces (e.g. by exploring and exploiting the links between language and vision)
- A portfolio of research and training that includes work on enabling extraction of knowledge from large-scale textual data. The opportunity exists for researchers to target interdisciplinary work in this area (e.g. textual analytics enabling analysis of medical records)
- Researchers working towards the goal of computing with meaning, contributing to the broad objective in Artificial Intelligence (AI) of developing computational methods for ascribing semantics to human behaviours (e.g. natural human interaction)
- A supply of people with high-level skills, reflecting increasingly acute demand as NLP technologies are used in an increasing number of applications
Researchers have the opportunity to play an important role in delivering the objectives of EPSRC's Future Intelligent Technologies and Data Enabled Decision Making cross-ICT priorities, and are well-placed to contribute to the other cross-ICT priorities. To maximise impact, they should ensure effective communication with researchers in areas such as AI Technologies, Visualisation and Human-Computer Interaction (HCI).
Responsible Innovation is a significant consideration. Researchers should be encouraged to address issues of trust, identity and privacy with regard to how NLP is used in social contexts and large-scale social networks.
Highlights:
In view of the recent growth of the AI technologies portfolio, in large part attributed to machine learning (ML) methods, it is clear that the research landscape in this area has changed significantly. The UK has a particular strength in its depth of experience in combining NLP with ML methods. NLP has been mentioned explicitly in the AI sector deal (Evidence Source 9) in relation to aiming to increase the AI workforce as part of Industrial action. NLP is also noted in the 2017 Hall and Pesenti report “Growing the artificial intelligence industry in the UK” (Evidence Source 8) for “Improving the resilience of UK industry from cyber attacks using active search, natural language processing and automated code and security integrity verification methods.”
A combination of factors (the application of ML methods to vast amounts of linguistic data, and a significant increase in computing power) has led to recent advances in NLP and related technologies, and this is likely to continue. The UK has a small number of world-leading NLP research groups and is considered internationally competitive. It is therefore well placed to capitalise on advances in this area, provided there is increased capacity to do so. Capacity is currently low, but we wish to support the future success of the research base as demand for capability to create and integrate intelligent interfaces increases.
Industrial strength at the interface of speech/language technologies and ML is evidenced by the significant investment being made by major IT companies (e.g. Amazon, Google and Apple), who have created or expanded UK-based research facilities and are heavily recruiting researchers with NLP expertise. There have been several recent, high-profile UK start-up acquisitions (e.g. Dark Blue Labs by Google, VocalIQ by Apple and SwiftKey by Microsoft) and significant growth in commercial interest in using NLP technologies (e.g. in conversational interfaces and automated personal assistants).
There is a need to ensure a supply of people with high-level skills in NLP. As noted above, major IT companies are heavily recruiting staff with PhD and postdoctoral experience in NLP. However, retention of expertise and key capacity in academia beyond PhD level is a recognised problem that will become even more acute as NLP and related technologies are utilised in an increasing number of applications.
NLP is a significant research area for data science, as it enables management of unstructured data (e.g. patient records, arts, heritage, literature, and/or the legal domain), and for Robotics and Autonomous Systems (RAS), where the use of NLP for human-robot interaction in integrated systems is increasingly important. In general terms, NLP is important to the health of related disciplines (e.g. HCI, Robotics and AI) as both a driver of research and a user/collaborator/magnifier for those disciplines’ research outputs.
NLP is expected to contribute significantly to the Connected Nation Outcome and, at a lesser level and/or over a longer timeframe, to the other Outcomes. Specific Ambitions of particular relevance are:
C1: Enable a competitive, data-driven economy
NLP will contribute to the interfaces that will be part of the smart tools and analytical techniques needed to generate actionable information from large and diverse datasets.
C2: Achieve transformational development and use of the Internet of Things
Communication between a wide range of sensors and devices, and their interaction with people, will lead to the next revolution in products and services. NLP will contribute to the way information can be intelligently assimilated and communicated.
C3: Deliver intelligent technologies and systems
NLP will contribute to the smart tools and intelligent technologies that will take the Connected Nation beyond data flows and turn data flows into physical action, and will increasingly serve as an interface for communicating between intelligent systems and the people using them.
C4: Ensure a safe and trusted cyber society
NLP researchers have the opportunity to contribute to development of new tools to analyse and interpret data for large-scale systems in order to detect crime and terrorism (e.g. understanding how those who would do us harm use the internet), as well as addressing other security issues, while ensuring citizens’ privacy and trust.
R3: Develop better solutions to acute threats: cyber, defence, financial and health
NLP can help ensure better ability to identify emerging threats or anomalous patterns within existing and future complex data environments.
- EPSRC, Analysis of Research Excellence Framework (REF) 2014 data and EPSRC Knowledge Maps, (2014)
- CITIA, CITIA Roadmap, (2016)
- C.D. Manning, (2015), Computational Linguistics and Deep Learning. Computational Linguistics 41(4), 701-707
- Community and user engagement (individual input and group feedback)
- J. Hirschberg and C.D. Manning, (2015), Advances in Natural Language Processing. Science 349(6245), 261-266
- EPSRC, Output from the Speech Technologies exceptions process, (2015)
- IT Jobs Watch, Tracking the IT Job Market, (2016)
- W. Hall and J. Pesenti, Growing the artificial intelligence industry in the UK, (2017)
- BEIS, Artificial Intelligence Sector Deal, (2018)