Natural language processing
Some interests
I have worked on most topics in NLP. Most of the time I use machine learning approaches, from classical
methods to deep learning.
I am familiar with the different large language model architectures.
I still think some rule-based approaches have a role to play in
the NLP world. Even then, the required data should be created
using different ML techniques.
Here you can see a non-exhaustive list of topics in NLP that interest me:
- Interaction of LLMs within different orchestration mechanisms (that is, going beyond RAG): since the latest
AI craze, triggered by chatbots, we have seen, yet again, a lot of talk about agents.
Different approaches are being developed out there, but I believe
we need to develop more sophisticated architectures in this area.
- Language interfaces: basically, the conversion of natural language input into semantics and back again. Multiple issues are involved, from parsing text and developing a comprehensive, manageable semantic representation to designing the best forms of interaction with the user and selecting supporting resources.
- Automatic sense disambiguation: this issue, obviously, is closely related to 1). What I like exploring here is
how network properties based on probabilistic and semantic features can help in deducing sense. I also believe it is possible to build very robust systems that use clues from multilingual resources.
- Creating algorithms that can handle the analysis of different human languages with
different morphologies, syntaxes and writing systems. I love Unicode and encoding issues, and
I love to abstract and implement code to recognise useful information in documents in languages
as different as Chinese and Russian, Dutch and Arabic.
- Last but not least, working on automating to the fullest the conversion of structured
information (not only from language sources but also from images) into reliable, actionable data.
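To make the multilingual point in the list above a little more concrete, here is a minimal sketch of script-aware text handling. It assumes only Python's standard `unicodedata` module; the function name `dominant_script` and the sample strings are hypothetical, chosen for illustration, and the heuristic (taking the first word of a character's Unicode name as its script) is a rough approximation rather than a proper use of the Unicode Script property.

```python
import unicodedata
from collections import Counter


def dominant_script(text: str) -> str:
    """Guess the dominant script of `text` from Unicode character names.

    Heuristic (an assumption of this sketch): the first word of a letter's
    Unicode name usually names its script, e.g. "CYRILLIC SMALL LETTER A"
    or "CJK UNIFIED IDEOGRAPH-4E2D".
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    # Fall back to a placeholder label when no letters were found.
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


# Hypothetical sample phrases in four writing systems.
samples = ["自然语言处理", "обработка языка", "taalverwerking", "معالجة اللغة"]
for sample in samples:
    print(sample, "->", dominant_script(sample))
```

A real system would more likely rely on the Unicode Script property and handle mixed-script text, but even this rough heuristic shows how much the character database alone can tell you before any language-specific processing starts.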
You can find a list of some of my publications here.
I describe very briefly the concept of dynamic lexical relations here.
I list some NLP publications I like in the technical book page.
Some general online resources for NLP
Hugging Face | One of the best-known companies providing ML services, including the Transformers library. |
Gensim | One of the handiest tools for word embeddings (and more) |
spaCy | Very well-known NLP tool in Python |
NLTK | Another well-known, somewhat older, NLP tool in Python |
Unicode | All about those Unicode characters |
UIMA | IBM's framework for annotating unstructured content |
SENSEVAL | The standard for sense disambiguation |
Andrés Domínguez Burgos, 2025 ©