The 27th International Conference on Natural Language & Information Systems will be held at the Universitat Politècnica de València, Spain. Although we plan to organize the conference in a hybrid mode so that everybody can attend, we would be delighted to host you in Valencia and count on you to have a fruitful and enjoyable conference. Since 1995, the NLDB conference has brought together researchers, industry practitioners, and potential users interested in various applications of Natural Language in the Database and Information Systems field. The term "Information Systems" has to be considered in the broader sense of Information and Communication Systems, including Big Data, Linked Data and Social Networks.
The field of Natural Language Processing (NLP) has itself recently experienced several exciting developments. In research, these developments have been reflected in the emergence of neural language models (Deep Learning, Word Embeddings, Transformers) and the importance of aspects such as transparency, bias and fairness, a (renewed) interest in various linguistic phenomena, such as in discourse and argumentation mining, and in new problems such as the detection of disinformation and hate speech in social media, as well as of mental health disorders that increased during the recent pandemic. Regarding applications, NLP systems have evolved to the point that they now offer real-life, tangible benefits to enterprises. Many of these NLP systems are now considered a de facto offering in business intelligence suites, such as algorithms for recommender systems and opinion mining/sentiment analysis.
It is against this backdrop of recent innovations in NLP and its applications in information systems that the 27th edition of the NLDB conference takes place. We welcome research and industrial contributions, describing novel, previously unpublished works on NLP and its applications across a plethora of topics as described in the Call for Papers.
For full details, please have a look at the Call for Papers.
This year's edition of NLDB also introduces an Industry Track, to foster fruitful interaction between industry and the research community.
Topics of interest include but are not limited to:
Social Media and Web Analytics: Opinion mining/sentiment analysis, irony/sarcasm detection; detection of fake reviews and deceptive language; detection of harmful information: fake news and hate speech; sexism and misogyny; detection of mental health disorders; identification of stereotypes and social biases; robust NLP methods for sparse, ill-formed texts; recommendation systems.
Deep Learning and eXplainable Artificial Intelligence (XAI): Deep learning architectures, word embeddings, transparency, interpretability, fairness, debiasing, ethics.
Argumentation Mining and Applications: Automatic detection of argumentation components and relationships; creation of resources (e.g. annotated corpora, treebanks and parsers); integration of NLP techniques with formal, abstract argumentation structures; argumentation mining from legal texts and scientific articles.
Question Answering (QA): Natural language interfaces to databases, QA using web data, multi-lingual QA, non-factoid QA (how/why/opinion questions, lists), geographical QA, QA corpora and training sets, QA over linked data (QALD).
Corpus Analysis: multi-lingual, multi-cultural and multi-modal corpora; machine translation, text analysis, text classification and clustering; language identification; plagiarism detection; information extraction: named entity, extraction of events, terms and semantic relationships.
Semantic Web, Open Linked Data, and Ontologies: Ontology learning and alignment, ontology population, ontology evaluation, querying ontologies and linked data, semantic tagging and classification, ontology-driven NLP, ontology-driven systems integration.
Natural Language in Conceptual Modeling: Analysis of natural language descriptions, NLP in requirements engineering, terminological ontologies, consistency checking, metadata creation and harvesting.
Natural Language and Ubiquitous Computing: Pervasive computing, embedded, robotic and mobile applications; conversational agents; NLP techniques for Internet of Things (IoT); NLP techniques for ambient intelligence.
Big Data and Business Intelligence: Identity detection, semantic data cleaning, summarisation, reporting, and data to text.
Full paper submission (EXTENDED): 14 March, 2022
Paper notification (EXTENDED): 10 April, 2022
Camera-ready deadline (EXTENDED): 20 April, 2022
Conference: 15-17 June 2022
If you are interested in attending the conference, please complete the registration form on the Registration page.
Authors should follow the LNCS format (https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines) and submit their manuscripts in PDF format via EasyChair (https://easychair.org/conferences/?conf=nldb2022).
Submissions can be full papers (12 pages maximum including references), short papers (8 pages including references) or papers for a poster presentation or system demonstration (6 pages including references). The programme committee may decide to accept some full papers as short papers or poster papers.
Papers can be submitted to either the main conference or the industry track. The reviewing process of NLDB 2022 is double-blind, i.e., submissions to the main conference and to the industry track must not contain author names or other identifying information, such as funding sources or acknowledgments, and must use the third person to refer to work the authors have previously undertaken. System demonstration papers may not be anonymous.
Few-shot Information Extraction is here: Pre-train, prompt and entail
Abstract: Deep Learning has made tremendous progress in Natural Language Processing (NLP), where large pre-trained language models (PLMs) fine-tuned on the target task have become the predominant tool. More recently, in a process called prompting, NLP tasks are rephrased as natural language text, allowing us to better exploit linguistic knowledge learned by PLMs and resulting in significant improvements. Still, PLMs have limited inference ability. In the Textual Entailment task, systems need to output whether the truth of a certain textual hypothesis follows from the given premise text. Manually annotated entailment datasets covering multiple inference phenomena have been used to infuse inference capabilities into PLMs.
This talk will review these recent developments, and will present an approach that combines prompts and PLMs fine-tuned for textual entailment that yields state-of-the-art results on Information Extraction (IE) using only a small fraction of the annotations. The approach has additional benefits, like the ability to learn from different schemas and inference datasets. These developments enable a new paradigm for IE where the expert can define the domain-specific schema using natural language and directly run those specifications, annotating a handful of examples in the process. A user interface based on this new paradigm will also be presented.
Beyond IE, inference capabilities could be extended, acquired and applied from other tasks, opening a new research avenue where entailment and downstream task performance improve in tandem.
Short Bio: Eneko Agirre is a Professor of Informatics and Head of HiTZ Basque Center of Language Technology at the University of the Basque Country, UPV/EHU, in San Sebastian, Spain. He has been active in Natural Language Processing and Computational Linguistics for decades. He received the Spanish Informatics Research Award in 2021, and is one of the 74 fellows of the Association for Computational Linguistics (ACL). He was President of ACL's SIGLEX, a member of the editorial boards of Computational Linguistics and the Journal of Artificial Intelligence Research, and Action Editor for the Transactions of the ACL. He is co-founder of the Joint Conference on Lexical and Computational Semantics (*SEM). He is the recipient of three Google Research Awards and five best paper awards and nominations. Dissertations under his supervision received best PhD awards from EurAI, the Spanish NLP society and the Spanish Informatics Scientific Association. He has over 200 publications across a wide range of NLP and AI topics. His research spans topics such as Word Sense Disambiguation, Semantic Textual Similarity, Unsupervised Machine Translation and resources for Basque. Most recently his research focuses on inference and deep learning language models.
User-centric Natural Language Understanding
Abstract: People express themselves in different ways – due to their individual characteristics, communication goals, cultural background, affinity to various sociodemographic groups, or just as a matter of personal style. Leveraging these differences can be beneficial for NLP applications. In this talk, I explore methods for interpreting the language together with its user-dependent aspects – personal history, beliefs, and social environment – and their effect on social NLP tasks.
Short Bio: Lucie Flek is an Associate Professor at the Philipps-Universität Marburg, leading the research group on Conversational AI and Social Analytics. She has been investigating how various individuals and sociodemographic groups differ in their language use, and how this variation can in turn be used in machine learning tasks to predict in-group behavior of interest. Previously, she managed natural language understanding research programs at Amazon Alexa. In her academic work at TU Darmstadt, the Positive Psychology Center at the University of Pennsylvania, and University College London, she focused on psychological and social applications of stylistic variation insights. She has served as Area Chair for Computational Social Sciences at multiple ACL* conferences, and is a co-organizer of the Stylistic Variation workshop and Widening NLP. Before her research path in natural language processing, Lucie contributed to particle physics research in the area of axion searches.
Natural Language Processing for Industrial Financial Predictive Analysis and Stock Trading
Abstract: In the financial industry, risk modelling, trading strategy design, and profit generation heavily rely on accurately predicting stock movements. Stock movements are influenced by varied factors beyond the conventionally studied historical prices, such as social media and correlations among stocks. The rising ubiquity of online content and knowledge mandates an exploration of models that factor in such multimodal signals for accurate stock forecasting. In this talk, I introduce a set of modern AI and NLP-centric methods and techniques using alternate sources of data - social media text, financial disclosures, documents, and multimodal data such as audio from financial earnings calls for building financial models in the industry to trade stocks and cryptocurrency. I then delve into the architecture of these models - covering multimodal, sequential, and graph neural networks - and analyse them across a diverse spectrum of metrics through an industry lens - quantitative performance, profitability, qualitative analysis, computational complexity, gender bias, and improvements over conventional financial predictive analysis methods.
Short Bio: Ramit Sawhney is an engineering manager at Tower Research Capital, and a research associate at the Georgia Institute of Technology, the AI Institute at the University of South Carolina, and the University of Marburg. Ramit's primary interests across his industrial and academic roles lie in Quantitative Finance, Natural Language Processing, and Deep Learning. Ramit's work in these areas has been presented at top-tier NLP and AI conferences including ACL, EMNLP, NAACL, AAAI, SIGIR, WSDM, WWW, EACL, ICASSP, IJCAI and more. Ramit also serves as an organizer, reviewer, and host across a wide variety of industry-focused research initiatives and conferences. Ramit started his career as a software engineer at Tower Research Capital after completing his undergraduate studies, and since then has focused on building technical infrastructure for financial use-cases.
The proceedings of NLDB 2022 are available here: https://link.springer.com/book/10.1007/978-3-031-08473-7. Free access to the proceedings will be granted for 4 weeks, starting from 13 June 2022. We plan to publish the extended version of the best papers in a journal.
# | Area | Hotel | Website | Distance |
---|---|---|---|---|
1 | Urban and near | RENASA | https://sweethotelrenasa.com/ | 11 min walking |
2 | University Residence | GALILEO | https://www.galileogalilei.com/ | 5 min walking |
3 | University Residence | COLEGIO MAYOR CONCEPCIÓN | https://www.resa.es/es/residencias/valencia/residencia-universitaria-la-concepcion/residencia/ | 18 min walking, 19 min by bus |
4 | In port avenue | NH | https://www.nh-hoteles.es/hotel/nh-ciudad-de-valencia | 38 min walking, 29 min by bus |
5 | Beach | NEPTUNO | http://www.neptuno-hotel-valencia.com/index_es.htm | 48 min walking, 26 min by tram |
6 | Beach | SOL Y PLAYA | https://hotelsolplaya.com/ | 48 min walking, 26 min by tram |
7 | Beach | LAS ARENAS | https://www.hotelvalencialasarenas.com/ | 48 min walking, 26 min by tram |
8 | City center | PLAZA MERCADO | https://myrhotels.com/hoteles/hotel-spa-plaza-mercado/ | 44 min walking, 28 min by bus |
9 | City center | PETIT PALACE | https://www.petitpalaceplazadelareina.com/es/ | 42 min walking, 22 min by bus |
10 | City center | CASUAL VINTAGE | https://casualhoteles.com/hoteles-valencia | 42 min walking, 23 min by bus |
11 | Old district | ANTIGUA | https://www.hostalam.com/ | 42 min walking, 27 min by tram |
12 | Old district | MORELLANA | https://monsuitescatedral.com/ | 41 min walking, 26 min by tram |
13 | Near Arts and Science City (www.cac.es) | SILKEN | https://www.hoteles-silken.com/es/hotel-puerta-valencia/ | 23 min walking, 25 min by bus |
14 | Near Arts and Science City (www.cac.es) | BARCELÓ | https://www.barcelo.com/es-es/barcelo-valencia/ | 36 min walking, 40 min by bus |