Joschka Kersting, M.Sc.

Collaborative Research Centre 901

Research Associate

+49 5251 60-5669

Please forward internal mail to me via Ms. Saage or Ms. Parisi.

04/2020 - present

Research Associate in the Collaborative Research Centre 901 (SFB 901): On-The-Fly Computing

Universität Paderborn

04/2018 - 03/2020

Research Associate at the Professorship for Digital Cultural Sciences, Semantic Information Processing Group

Prof. Dr. Michaela Geierhos
Universität Paderborn

12/2017 - 03/2018

Research Assistant at the Professorship for Digital Cultural Sciences

Prof. Dr. Michaela Geierhos
Universität Paderborn

10/2015 - 03/2018

Master's degree program in Management Information Systems

Universität Paderborn

04/2016 - 11/2017

Research Assistant at the Junior Professorship for Business Information Systems, esp. Semantic Information Processing

Jun.-Prof. Dr. Michaela Geierhos
Heinz Nixdorf Institut
Universität Paderborn

10/2016 - 12/2016

Research stay at KISTI

Korea Institute of Science and Technology Information (KISTI)
Daejeon, Republic of Korea (South Korea)

10/2012 - 09/2015

Dual Bachelor's degree program in International Business

Fachhochschule der Wirtschaft (FHDW), state-recognized private university of applied sciences, Paderborn
including work in CRM IT at arvato (Bertelsmann)

01/2014 - 02/2014

Internship abroad in Moscow

Internship at an international fashion company in Moscow, Russia


Human Language Comprehension in Aspect Phrase Extraction with Importance Weighting

J. Kersting, M. Geierhos, in: Natural Language Processing and Information Systems, Springer, 2021, pp. N.N.

In this study, we describe a text processing pipeline that transforms user-generated text into structured data. To do this, we train neural and transformer-based models for aspect-based sentiment analysis. As most research deals with explicit aspects from product or service data, we extract and classify implicit and explicit aspect phrases from German-language physician review texts. Patients often rate on the basis of perceived friendliness or competence. The vocabulary is difficult, the topic sensitive, and the data user-generated. The aspect phrases come with various wordings using insertions and are not noun-based, which makes the presented case equally relevant and reality-based. To find complex, indirect aspect phrases, up-to-date deep learning approaches must be combined with supervised training data. We describe three aspect phrase datasets, one of them new, as well as a newly annotated aspect polarity dataset. Alongside this, we build an algorithm to rate the aspect phrase importance. All in all, we train eight transformers on the new raw data domain, compare 54 neural aspect extraction models and, based on this, create eight aspect polarity models for our pipeline. These models are evaluated by using Precision, Recall, and F-Score measures. Finally, we evaluate our aspect phrase importance measure algorithm.

    Towards Aspect Extraction and Classification for Opinion Mining with Deep Sequence Networks

    J. Kersting, M. Geierhos, in: Natural Language Processing in Artificial Intelligence -- NLPinAI 2020, Springer, 2021, pp. 163--189

    This chapter concentrates on aspect-based sentiment analysis, a form of opinion mining where algorithms detect sentiments expressed about features of products, services, etc. We especially focus on novel approaches for aspect phrase extraction and classification trained on feature-rich datasets. Here, we present two new datasets, which we gathered from the linguistically rich domain of physician reviews, as other investigations have mainly concentrated on commercial reviews and social media reviews so far. To give readers a better understanding of the underlying datasets, we describe the annotation process and inter-annotator agreement in detail. In our research, we automatically assess implicit mentions or indications of specific aspects. To do this, we propose and utilize neural network models that perform the here-defined aspect phrase extraction and classification task, achieving F1-score values of about 80% and accuracy values of more than 90%. As we apply our models to a comparatively complex domain, we obtain promising results.

      Well-being in Plastic Surgery: Deep Learning Reveals Patients' Evaluations

      J. Kersting, M. Geierhos, in: Proceedings of the 10th International Conference on Data Science, Technology and Applications (DATA 2021), SCITEPRESS, 2021, pp. N.N.


      Neural Learning for Aspect Phrase Extraction and Classification in Sentiment Analysis

      J. Kersting, M. Geierhos, in: Proceedings of the 33rd International Florida Artificial Intelligence Research Symposium (FLAIRS) Conference, AAAI, 2020, pp. 282--285

      Aspect Phrase Extraction in Sentiment Analysis with Deep Learning

      J. Kersting, M. Geierhos, in: Proceedings of the 12th International Conference on Agents and Artificial Intelligence (ICAART 2020) -- Special Session on Natural Language Processing in Artificial Intelligence (NLPinAI 2020), SCITEPRESS, 2020, pp. 391--400

      This paper deals with aspect phrase extraction and classification in sentiment analysis. We summarize current approaches and datasets from the domain of aspect-based sentiment analysis. This domain detects sentiments expressed for individual aspects in unstructured text data. So far, mainly commercial user reviews for products or services such as restaurants were investigated. We here present our dataset consisting of German physician reviews, a sensitive and linguistically complex field. Furthermore, we describe the annotation process of a dataset for supervised learning with neural networks. Moreover, we introduce our model for extracting and classifying aspect phrases in one step, which obtains an F1-score of 80%. By applying it to a more complex domain, our approach and results outperform previous approaches.

        What Reviews in Local Online Labour Markets Reveal about the Performance of Multi-Service Providers

        J. Kersting, M. Geierhos, in: Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods, SCITEPRESS, 2020, pp. 263--272

        This paper deals with online customer reviews of local multi-service providers. While many studies investigate product reviews and online labour markets with service providers delivering intangible products “over the wire”, we focus on websites where providers offer multiple distinct services that can be booked, paid and reviewed online but are performed locally offline. This type of service provider has so far been neglected in the literature. This paper analyses reviews and applies sentiment analysis. It aims to gain new insights into local multi-service providers’ performance. We present a broad range of literature on the topics addressed. The results show, among other things, that providers with good ratings continue to perform well over time. We find that many positive reviews seem to encourage sales. On average, quantitative star ratings and qualitative ratings in the form of review texts match. Further results are also achieved in this study.

          Semantic Tagging of Requirement Descriptions: A Transformer-based Approach

          J. Kersting, F.S. Bäumer, in: Proceedings of the 17th International Conference on Applied Computing, IADIS, 2020, pp. 119--123

          Tag Me If You Can: Insights into the Challenges of Supporting Unrestricted P2P News Tagging

          F.S. Bäumer, J. Kersting, B. Buff, M. Geierhos, in: Information and Software Technologies, Springer, 2020, pp. 368--382

          Peer-to-Peer news portals allow Internet users to write news articles and make them available online to interested readers. Despite the fact that authors are free in their choice of topics, there are a number of quality characteristics that an article must meet before it is published. In addition to meaningful titles, comprehensibly written texts and meaningful images, relevant tags are an important criterion for the quality of such news. In this case study, we discuss the challenges and common mistakes that Peer-to-Peer reporters face when tagging news and how incorrect information can be corrected through the orchestration of existing Natural Language Processing services. Lastly, we use this illustrative example to give insight into the challenges of dealing with bottom-up taxonomies.

            Detection of Privacy Disclosure in the Medical Domain: A Survey

            B. Buff, J. Kersting, M. Geierhos, in: Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2020), SCITEPRESS, 2020, pp. 630--637

            When it comes to increased digitization in the health care domain, privacy is a relevant topic nowadays. This relates to patient data, electronic health records or physician reviews published online, for instance. There exist different approaches to the protection of individuals’ privacy, which focus on the anonymization and masking of personal information subsequent to their mining. In the medical domain in particular, measures to protect the privacy of patients are of high importance due to the amount of sensitive data that is involved (e.g. age, gender, illnesses, medication). While privacy breaches in structured data can be detected more easily, disclosure in written texts is more difficult to find automatically due to the unstructured nature of natural language. Therefore, we take a detailed look at existing research on areas related to privacy protection. Likewise, we review approaches to the automatic detection of privacy disclosure in different types of medical data. We provide a survey of several studies concerned with privacy breaches in the medical domain with a focus on Physician Review Websites (PRWs). Finally, we briefly develop implications and directions for further research.


              Natural Language Processing in OTF Computing: Challenges and the Need for Interactive Approaches

              F.S. Bäumer, J. Kersting, M. Geierhos, Computers (2019), 8(1)

              The vision of On-the-Fly (OTF) Computing is to compose and provide software services ad hoc, based on requirement descriptions in natural language. Since non-technical users write their software requirements themselves and in unrestricted natural language, deficits occur such as inaccuracy and incompleteness. These deficits are usually met by natural language processing methods, which have to face special challenges in OTF Computing because maximum automation is the goal. In this paper, we present current automatic approaches for solving inaccuracies and incompletenesses in natural language requirement descriptions and elaborate open challenges. In particular, we will discuss the necessity of domain-specific resources and show why, despite far-reaching automation, an intelligent and guided integration of end users into the compensation process is required. In this context, we present our idea of a chat bot that integrates users into the compensation process depending on the given circumstances.

              In Reviews We Trust: But Should We? Experiences with Physician Review Websites

              J. Kersting, F.S. Bäumer, M. Geierhos, in: Proceedings of the 4th International Conference on Internet of Things, Big Data and Security, SCITEPRESS, 2019, pp. 147-155

              The ability to openly evaluate products, locations and services is an achievement of the Web 2.0. It has never been easier to inform oneself about the quality of products or services and possible alternatives. Forming one’s own opinion based on the impressions of other people can lead to better experiences. However, this presupposes trust in one’s fellows as well as in the quality of the review platforms. In previous work on physician reviews and the corresponding websites, it was observed that some reviewers behaved improperly and that there were noteworthy differences in the technical implementation of the portals and in the efforts of site operators to maintain high quality reviews. These experiences raise new questions regarding what trust means on review platforms, how trust arises and how easily it can be destroyed.


                Towards a Multi-Stage Approach to Detect Privacy Breaches in Physician Reviews

                F.S. Bäumer, J. Kersting, M. Orlikowski, M. Geierhos, in: Proceedings of the Posters and Demos Track of the 14th International Conference on Semantic Systems co-located with the 14th International Conference on Semantic Systems (SEMANTiCS 2018), 2018

                Physician Review Websites allow users to evaluate their experiences with health services. As these evaluations are regularly contextualized with facts from users’ private lives, they often accidentally disclose personal information on the Web. This poses a serious threat to users’ privacy. In this paper, we report on early work in progress on “Text Broom”, a tool to detect privacy breaches in user-generated texts. For this purpose, we conceptualize a pipeline which combines methods of Natural Language Processing such as Named Entity Recognition, linguistic patterns and domain-specific Machine Learning approaches which have the potential to recognize privacy violations with wide coverage. A prototypical web application is openly accessible.

                Rate Your Physician: Findings from a Lithuanian Physician Rating Website

                F.S. Bäumer, J. Kersting, V. Kuršelis, M. Geierhos, in: Communications in Computer and Information Science, Springer, 2018, pp. 43-58

                Physician review websites are known around the world. Patients review the subjectively experienced quality of medical services supplied to them and publish an overall rating on the Internet, where quantitative grades and qualitative texts come together. On the one hand, these new possibilities reduce the imbalance of power between health care providers and patients, but on the other hand, they can also damage the usually very intimate relationship between health care providers and patients. Review websites must meet these requirements with a high level of responsibility and service quality. In this paper, we look at the situation in Lithuania: Especially, we are interested in the available possibilities of evaluation and interaction, and the quality of a particular review website measured against the available data. We thereby identify quality weaknesses and lay the foundation for future research.


                  Using Sentiment Analysis on Local Up-to-the-Minute News: An Integrated Approach

                  J. Kersting, M. Geierhos, in: Information and Software Technologies: 23rd International Conference, ICIST 2017, Druskininkai, Lithuania, October 12–14, 2017, Proceedings, Springer, 2017, pp. 528-538

                  In this paper, we present a search solution that makes local news information easily accessible. In the era of fake news, we provide an approach for accessing news information through opinion mining. This enables users to view news on the same topics from different web sources. By applying sentiment analysis on social media posts, users can better understand how issues are captured and see people’s reactions. Therefore, we provide a local search service that first localizes news articles, then visualizes their occurrence according to the frequency of mentioned topics on a heatmap and even shows the sentiment score for each text.

                    Privacy Matters: Detecting Nocuous Patient Data Exposure in Online Physician Reviews

                    F.S. Bäumer, N. Grote, J. Kersting, M. Geierhos, in: Information and Software Technologies: 23rd International Conference, ICIST 2017, Druskininkai, Lithuania, October 12–14, 2017, Proceedings, Springer, 2017, pp. 77-89

                    Consulting a physician was long regarded as an intimate and private matter. The physician-patient relationship was perceived as sensitive and trustful. Nowadays, there is a change, as medical procedures and physician consultations are reviewed like other services on the Internet. To allay users’ privacy doubts, physician review websites assure anonymity and the protection of private data. However, there are hundreds of reviews that reveal private information and hence enable physicians or the public to identify patients. Thus, we draw attention to the cases when de-anonymization is possible. We therefore introduce an approach that highlights private information in physician reviews for users to avoid an accidental disclosure. For this reason, we combine established natural-language-processing techniques such as named entity recognition as well as handcrafted patterns to achieve a high detection accuracy. That way, we can help websites to increase privacy protection by recognizing and uncovering apparently uncritical information in user-generated texts.

                      Internet of Things Architecture for Handling Stream Air Pollution Data

                      J. Kersting, M. Geierhos, H. Jung, T. Kim, in: Proceedings of the 2nd International Conference on Internet of Things, Big Data and Security, SCITEPRESS, 2017, pp. 117-124

                      In this paper, we present an IoT architecture which handles stream sensor data of air pollution. Particle pollution is known as a serious threat to human health. Along with developments in the use of wireless sensors and the IoT, we propose an architecture that flexibly measures and processes stream data collected in real-time by movable and low-cost IoT sensors. Thus, it enables a wide-spread network of wireless sensors that can follow changes in human behavior. Apart from stating reasons for the need of such a development and its requirements, we provide a conceptual design as well as a technological design of such an architecture. The technological design consists of Kaa and Apache Storm which can collect air pollution information in real-time and solve various problems to process data such as missing data and synchronization. This enables us to add a simulation in which we provide issues that might come up when having our architecture in use. Together with these issues, we state reasons for choosing specific modules among candidates. Our architecture combines wireless sensors with the Kaa IoT framework, an Apache Kafka pipeline and an Apache Storm Data Stream Management System among others. We even provide open-government data sets that are freely available.
