Natural language processing

Some interests

I have worked on most topics in NLP. Most of the time I use machine learning approaches, from classical methods to deep learning, and I am familiar with the different large language model architectures. I still think some rule-based approaches have a role to play in the NLP world. Even so, we should create the required data using different ML techniques. Here is a non-exhaustive list of NLP topics that interest me:

1) Interaction of LLMs within different orchestrating mechanisms (aka going beyond RAG): since the latest AI craze triggered by chatbots, we have seen, yet again, a lot of talk about agents. I believe different approaches are being developed out there, but we need more sophisticated architectures in this area.
2) Language interfaces: basically, the conversion of natural language input into semantics and back again. Multiple issues are involved, from parsing text, to developing a comprehensive and manageable semantic representation, to designing the best forms of interaction with the user, to selecting supporting resources.
3) Automatic sense disambiguation: this issue, obviously, is closely related to 1). What I like discovering here is how network properties based on probabilistic and semantic features can help in deducing sense. I also believe it is possible to build very robust systems that employ clues from multilingual resources.
4) Creating algorithms that can handle the analysis of different human languages with different morphologies, syntaxes and writing systems. I love Unicode and encoding issues, and I love to abstract and implement code that recognises useful information in documents in languages as different as Chinese and Russian, Dutch and Arabic.
5) Last but not least, automating to the fullest the conversion of structured information (not only from language sources but also from images) into reliable, actionable data.
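
As a small illustration of the point above about documents that mix writing systems, here is a minimal sketch that counts the letters of a text per script. It is only a heuristic: Python's standard library does not expose the Unicode Script property, so the sketch takes the first word of each character's Unicode name as a stand-in for its script. The function names `guess_script` and `script_profile` are made up for this example.

```python
import unicodedata
from collections import Counter


def guess_script(ch: str) -> str:
    """Guess the script of one letter from the first word of its Unicode name.

    Heuristic only: e.g. 'CYRILLIC SMALL LETTER A' -> 'CYRILLIC',
    'CJK UNIFIED IDEOGRAPH-4E2D' -> 'CJK'.
    """
    try:
        name = unicodedata.name(ch)
    except ValueError:
        # Character has no name in the Unicode database.
        return "UNKNOWN"
    return name.split()[0]


def script_profile(text: str) -> Counter:
    """Count the letters of `text` per guessed script."""
    # Normalise first so that precomposed and decomposed forms count alike.
    text = unicodedata.normalize("NFC", text)
    return Counter(guess_script(ch) for ch in text if ch.isalpha())


if __name__ == "__main__":
    sample = "Привет, wereld! 中文 مرحبا"
    for script, count in script_profile(sample).most_common():
        print(script, count)
```

A real system would more likely rely on the actual Script property (for instance through a third-party library), but even this rough profile is often enough to decide which tokeniser or language-specific rules to try first.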

You can find a list of some of my publications here.

I describe the concept of dynamic lexical relations very briefly here.

I list some NLP publications I like in the technical book page.

Some general online resources for NLP

Hugging Face	One of the best-known companies providing ML services, including the Transformers library
Gensim	One of the handiest tools for word embeddings (and more)
spaCy	A very well-known NLP library in Python
NLTK	Another well-known, somewhat older NLP toolkit in Python
Unicode	All about those Unicode characters
UIMA	A framework for annotating unstructured content, originally developed by IBM and now an Apache project
SENSEVAL	The standard evaluation exercises for word sense disambiguation, since continued as SemEval