Text analysis in the pharmaceutical industry - Case Study


PHARMACEUTICAL · PHARMACOVIGILANCE & MARKET INTELLIGENCE

Automated multilingual text analysis for adverse event detection

A global Swiss pharmaceutical company needed to process massive volumes of multilingual medical texts, social media posts, and patient feedback – identifying adverse drug events, detecting medical entities, and gauging market sentiment. We built the NLP pipeline that automated what teams had been doing manually.

Client: Global Swiss pharmaceutical company (under NDA) – a multinational enterprise operating in Europe and worldwide.

KEY RESULTS

5 languages: Multilingual pipeline covering German, English, French, Italian, and Spanish

Automated: Adverse drug event detection from unstructured online sources

GDPR: Full de-identification of patient and physician data

Real-time: Market sentiment tracking before and during drug launches

INDUSTRY: Pharmaceutical

USE CASE: Pharmacovigilance & text analysis

AI APPROACH: NLP, NER, sentiment analysis

DATA SOURCES: Medical notes, social media, online posts

LANGUAGES: DE, EN, FR, IT, ES

ENGAGEMENT: Multi-phase delivery with partner


The challenge

In pharmaceutical development and post-market surveillance, the volume of text-based data is enormous – patient feedback, medical professional notes, social media discussions, regulatory documents. For this global Swiss pharmaceutical company, teams were reviewing and classifying these texts manually, across five languages, under strict accuracy and compliance requirements.

The work was slow, inconsistent, and couldn’t scale. Meanwhile, critical signals – adverse drug events, shifts in public sentiment around a new drug, mentions of unknown side effects – were buried in unstructured text that no team could realistically process in full.

The core problem: critical pharmacovigilance and market intelligence data was locked inside millions of unstructured texts in five languages. Manual review couldn’t keep up with the volume, and sensitive health information made automation non-trivial – every pipeline had to be GDPR-compliant from the start.

What we built

Together with our partner, we delivered a series of interconnected NLP capabilities – each addressing a specific gap in the client’s text analysis workflow.

Medical entity detection. We built automated recognition of medical entities – drug names, active substances, dosage forms, and related terms – across all five languages. The system handled the full complexity of pharmaceutical nomenclature, including brand names, generics, and informal references used in patient-generated content.
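To make the idea concrete, here is a minimal sketch of lexicon-based entity matching. It is an illustration only, not the delivered system: the `LEXICON` entries, the `Entity` type, and `find_entities` are hypothetical names, and a production pipeline would more likely combine curated drug dictionaries with a trained NER model.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon (assumption): lowercase surface form ->
# (entity type, canonical name). Real coverage would come from curated
# drug dictionaries per language.
LEXICON = {
    "ibuprofen": ("ACTIVE_SUBSTANCE", "ibuprofen"),
    "ibuprofene": ("ACTIVE_SUBSTANCE", "ibuprofen"),
    "ibu": ("ACTIVE_SUBSTANCE", "ibuprofen"),  # informal patient shorthand
    "tablet": ("DOSAGE_FORM", "tablet"),
    "tablette": ("DOSAGE_FORM", "tablet"),
    "comprimé": ("DOSAGE_FORM", "tablet"),
    "compressa": ("DOSAGE_FORM", "tablet"),
    "comprimido": ("DOSAGE_FORM", "tablet"),
}


@dataclass(frozen=True)
class Entity:
    text: str       # surface form as written in the source text
    label: str      # entity type
    canonical: str  # normalized name shared across languages
    start: int
    end: int


# Longest terms first so "ibuprofen" is preferred over the prefix "ibu".
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_entities(text: str) -> list[Entity]:
    """Return lexicon matches with character offsets into `text`."""
    entities = []
    for match in _PATTERN.finditer(text):
        label, canonical = LEXICON[match.group(1).lower()]
        entities.append(
            Entity(match.group(1), label, canonical, match.start(1), match.end(1))
        )
    return entities


if __name__ == "__main__":
    for entity in find_entities("Nehme seit gestern Ibu, eine Tablette morgens"):
        print(entity)
```

Mapping every surface form to a canonical name is what lets brand names, generics, and informal references in different languages count as the same entity downstream.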

De-identification of medical texts. Before any analysis could happen, patient and physician data had to be removed. We implemented automated anonymization pipelines that stripped personally identifiable information from medical notes and records, making downstream analysis possible while maintaining full GDPR compliance.
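As a rough illustration of the de-identification step, the sketch below replaces a few identifier types with placeholder tags using regular expressions. The patterns, the `PATTERNS` list, and `deidentify` are assumptions made for this example; rule sets like this miss many identifiers, so real de-identification typically adds NER models and human validation.

```python
import re

# Hypothetical rule set (assumption). Order matters: e-mail addresses and
# dates are masked before phone numbers so that the looser phone pattern
# cannot swallow digits that belong to them.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("DATE", re.compile(r"\b\d{1,2}[./]\d{1,2}[./]\d{2,4}\b")),
    ("PHONE", re.compile(r"\+?\d[\d\s/-]{7,}\d")),
    # A title followed by a capitalized surname, e.g. "Dr. Müller".
    (
        "PERSON",
        re.compile(
            r"\b(?:Dr|Prof|Herr|Frau|Mr|Mrs|Ms|Mme|Sig|Sra)\.?\s+[A-ZÀ-ÖØ-Ý][\w-]+"
        ),
    ),
]


def deidentify(text: str) -> tuple[str, dict[str, int]]:
    """Mask matches with [LABEL] tags and report how many were replaced."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS:
        text, replaced = pattern.subn(f"[{label}]", text)
        if replaced:
            counts[label] = replaced
    return text, counts


if __name__ == "__main__":
    note = (
        "Dr. Müller saw the patient on 12.03.2021; "
        "contact anna.k@example.org or +41 44 123 45 67."
    )
    masked, stats = deidentify(note)
    print(masked)
    print(stats)
```

Returning the replacement counts alongside the masked text gives a simple audit trail, which is useful when compliance teams need to check what was removed.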

Adverse drug event detection. One of the most impactful components: we developed an approach to identify adverse events associated with specific drugs by analyzing opinions and posts published online. This moved pharmacovigilance from reactive manual review to proactive automated monitoring – catching signals that manual review would likely miss.
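One simple way to picture this kind of detection is sentence-level co-occurrence: flag a candidate when a drug mention and a symptom term appear in the same sentence and the symptom is not negated. The sketch below assumes hypothetical `DRUGS`, `SYMPTOMS`, and `NEGATIONS` lexicons and a placeholder product name `drugx`; it is not the approach used in the project, which is not described in detail.

```python
import re

# Hypothetical lexicons (assumptions for illustration only).
DRUGS = {"drugx"}  # placeholder product name
SYMPTOMS = {
    "headache": "headache",
    "kopfschmerzen": "headache",
    "nausea": "nausea",
    "übelkeit": "nausea",
    "rash": "rash",
}
NEGATIONS = {"no", "not", "without", "kein", "keine", "pas", "sin", "senza"}
NEGATION_WINDOW = 3  # tokens before the symptom checked for a negation cue


def detect_adverse_events(post: str) -> list[tuple[str, str, str]]:
    """Return (drug, canonical symptom, sentence) candidates from one post."""
    candidates = []
    for sentence in re.split(r"(?<=[.!?])\s+", post):
        tokens = re.findall(r"\w+", sentence.lower())
        drugs = sorted({t for t in tokens if t in DRUGS})
        if not drugs:
            continue  # a symptom without a drug mention is not a candidate
        for i, token in enumerate(tokens):
            if token not in SYMPTOMS:
                continue
            window = tokens[max(0, i - NEGATION_WINDOW):i]
            if any(w in NEGATIONS for w in window):
                continue  # e.g. "no rash" is not reported
            for drug in drugs:
                candidates.append((drug, SYMPTOMS[token], sentence.strip()))
    return candidates


if __name__ == "__main__":
    post = (
        "Started DrugX last week. "
        "Since taking DrugX I have nausea and a bad headache. "
        "No rash with DrugX though."
    )
    for drug, symptom, sentence in detect_adverse_events(post):
        print(drug, symptom, "<-", sentence)
```

Candidates produced this way are signals for human review, not confirmed adverse events; co-occurrence says nothing about causality.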

Sentiment analysis for drug launches. We built sentiment tracking across social media channels to gauge public perception of a drug before and during its market launch. This gave the client real-time market intelligence – how patients, caregivers, and healthcare professionals were actually talking about their products.
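A minimal sketch of how sentiment could be compared before and after a launch date is shown below. It assumes a tiny hypothetical polarity lexicon (`POLARITY`) and batch input; the actual system may well have used trained multilingual models and streaming ingestion, which this example does not attempt to reproduce.

```python
import re
from collections import defaultdict
from datetime import date

# Hypothetical polarity lexicon (assumption): +1 positive, -1 negative.
POLARITY = {
    "great": 1, "helps": 1, "better": 1, "gut": 1, "super": 1,
    "terrible": -1, "worse": -1, "schlecht": -1, "mauvais": -1,
}


def score(text: str) -> float:
    """Mean polarity of lexicon hits in [-1, 1]; 0.0 if no term matches."""
    tokens = re.findall(r"\w+", text.lower())
    hits = [POLARITY[t] for t in tokens if t in POLARITY]
    return sum(hits) / len(hits) if hits else 0.0


def sentiment_by_phase(
    posts: list[tuple[date, str]], launch_day: date
) -> dict[str, float]:
    """Average post score separately for the pre- and post-launch phases."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for day, text in posts:
        phase = "pre_launch" if day < launch_day else "post_launch"
        buckets[phase].append(score(text))
    return {phase: sum(vals) / len(vals) for phase, vals in buckets.items()}


if __name__ == "__main__":
    posts = [
        (date(2024, 1, 10), "Heard it helps, sounds great"),
        (date(2024, 2, 5), "Felt worse, terrible week"),
        (date(2024, 2, 6), "Wirkt gut"),
    ]
    print(sentiment_by_phase(posts, launch_day=date(2024, 2, 1)))
```

Bucketing by day or week instead of by phase follows the same pattern and gives the trend line a launch team would watch.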

Multilingual preprocessing. Each of these capabilities required robust handling of five languages, including the specific challenges of social media text: abbreviations, colloquial language, medical slang, and inconsistent formatting. We developed specialized preprocessing and NLP algorithms that could normalize and interpret these variations reliably.
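The kind of normalization described here can be sketched as follows. The abbreviation tables and the `normalize` function are hypothetical examples; real tables would be built per language from observed social media usage.

```python
import re
import unicodedata

# Hypothetical per-language abbreviation tables (assumption).
ABBREVIATIONS = {
    "en": {"meds": "medication", "rx": "prescription"},
    "de": {"kh": "krankenhaus", "tabl": "tablette"},
}


def normalize(text: str, lang: str) -> list[str]:
    """Return cleaned, lowercased tokens with known abbreviations expanded."""
    # Unify Unicode variants (e.g. full-width characters) and case.
    text = unicodedata.normalize("NFKC", text).lower()
    # Drop URLs and @mentions, which carry no medical content.
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    # Collapse letters repeated 3+ times ("sooooo" -> "soo"). Digits are
    # deliberately left alone so a dose such as "1000 mg" is not corrupted.
    text = re.sub(r"([^\W\d_])\1{2,}", r"\1\1", text)
    tokens = re.findall(r"\w+", text)
    table = ABBREVIATIONS.get(lang, {})
    return [table.get(token, token) for token in tokens]


if __name__ == "__main__":
    print(normalize("Sooooo tired since new meds!!! @pharma_help https://t.co/abc", "en"))
    print(normalize("Nehme 1000 mg, 2 Tabl", "de"))
```

Keeping the abbreviation tables per language avoids expanding a token that is shorthand in one language but an ordinary word in another.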

The results

BEFORE

Manual review of multilingual medical texts. Adverse events detected reactively, if at all. No systematic social media monitoring. GDPR compliance handled case-by-case.

AFTER

Automated pipelines processing texts across five languages. Proactive adverse event detection. Real-time sentiment tracking. Built-in de-identification for all medical data.

The pharmaceutical company gained the ability to process medical data at scale – something that was previously impossible with manual workflows. Unknown adverse events could now be surfaced from online sources automatically. The frequency of known adverse events could be verified against real-world data rather than relying solely on clinical reports.

Market launch teams got a direct, real-time view into how their drugs were being discussed publicly – enabling faster response to emerging concerns and more informed commercial decisions.

Most critically, all of this ran within a framework built around patient privacy: automated de-identification made GDPR compliance a built-in capability of every pipeline rather than an afterthought.

Technology used

Natural Language Processing, Named Entity Recognition, Sentiment Analysis, Text De-identification, Multilingual NLP, Adverse Event Detection, Social Media Analysis, Python

