Natural language processing

Some interests

I have worked on most topics in NLP. Most of the time I use machine learning approaches, yet I also feel at ease with rule-based ones, which I try to make as scalable and flexible as possible. Here is a non-exhaustive list of NLP topics that interest me:
  1. Language interfaces: basically, the conversion of natural language input into semantics and back again. There are multiple issues involved in this, from parsing text, to developing a comprehensive and manageable semantic representation, to designing the best forms of interaction with the user, to selecting supporting resources.
  2. Automatic sense disambiguation: this issue is closely related to 1). What I enjoy discovering here is how network properties based on probabilistic and semantic features can help in deducing sense. I also believe it is possible to build very robust systems that employ clues from multilingual resources.
  3. Creating algorithms that can handle the analysis of different human languages with different morphologies, syntaxes and writing systems. I love Unicode and encoding issues, and I love abstracting and implementing code to recognise useful information in documents written in languages as different as Chinese and Russian, Dutch and Arabic.
  4. Last but not least, working on automating as fully as possible the conversion of structured information (not only from language sources but also from images) into reliable, actionable data.

You can find a list of some publications of mine here

I briefly describe the concept of dynamic lexical relations here

I list some NLP publications I like in the technical book page.

Some general online resources for NLP

Gensim: one of the handiest tools for word embeddings (and more)
spaCy: a very well-known NLP tool in Python
NLTK: another well-known, somewhat older NLP tool in Python
Unicode: all about those Unicode characters
UIMA: IBM's framework for annotating unstructured content
SENSEVAL: the standard evaluation exercises for word sense disambiguation


Andrés Domínguez Burgos, 2024 ©