Luisa Zeppelin · Case Study · 3 min read

A data-driven web app that recommends collaborations and funding opportunities for YERUN researchers

CONNECT BY YERUN: AI-powered research collaboration platform


Project: CONNECT BY YERUN
Role: Data Scientist, Frontend Developer, UX Designer
Client: SDU RIO, University of Southern Denmark
Context: A data-driven platform for 23 YERUN member universities, enabling researchers to discover collaborations, funding opportunities, and publications.
Outcome: A scalable, AI-powered web app that simplifies complex data through hierarchical clustering and interactive visualizations.

Image: The landing page of CONNECT BY YERUN, designed to foster research collaborations.

The challenge

YERUN (Young European Research Universities Network) needed a platform that helps researchers in its network to:

  • Discover collaborators with complementary expertise
  • Find funding opportunities tailored to their profiles
  • Navigate vast databases (e.g., ORCID, CORDIS) effortlessly

The core challenge: with thousands of publications in the database, users could not realistically scan the entries by hand to find relevant content. The platform needed a way to organize and visualize this volume so that users could browse publications intuitively by topic.

My process

Since CONNECT BY YERUN serves a similar purpose to WaveLinks, I reused the Figma design from WaveLinks as a starting point and adapted it to YERUN’s branding and needs. This kept the two platforms consistent and shortened the design phase.

👉 See my post on WaveLinks: A web platform fostering collaboration and knowledge-sharing for ocean restoration

Data science: Taming the complexity

To make the publication database manageable, I developed a multi-layered clustering pipeline using NLP and LLM technologies:

  1. Hierarchical clustering:

    • Applied NLP algorithms to group publications into topics and subtopics.
    • Repeated the clustering process recursively until clusters were small enough for users to explore.
    • Used LDA (Latent Dirichlet Allocation) for topic modeling.
  2. Automated topic naming:

    • Leveraged a local LLM (via Ollama) to generate human-readable names for each cluster.
    • Implemented asynchronous processing so that large datasets can be handled efficiently, even on a local machine.
  3. Pipeline optimization:

    • Built a Django command for end-to-end clustering, with customizable parameters (e.g., sampling size, model name, cluster size).
    • Used multiprocessing and Joblib to optimize performance for large datasets.
    • Designed the pipeline to save intermediate results and clean up disk space automatically.
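
The recursive step in the list above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: `Cluster`, `build_hierarchy`, and `split_by_common_word` are hypothetical names, and the toy word-based splitter only stands in for the LDA topic model so the example runs with the standard library alone.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Iterator

# A splitter takes documents and returns groups of documents.
# In the real pipeline this role would be played by a topic model such as LDA.
Splitter = Callable[[list[str]], list[list[str]]]


@dataclass
class Cluster:
    docs: list[str]
    children: list["Cluster"] = field(default_factory=list)


def split_by_common_word(docs: list[str]) -> list[list[str]]:
    """Toy stand-in for a topic model (an assumption for this sketch):
    split on the most frequent word that is not shared by every document."""
    counts: Counter = Counter()
    for doc in docs:
        # Count each word once per document, in a deterministic order.
        counts.update(list(dict.fromkeys(doc.lower().split())))
    candidates = [w for w, c in counts.most_common() if c < len(docs)]
    if not candidates:
        return [docs]  # nothing distinguishes the documents
    word = candidates[0]
    with_word = [d for d in docs if word in d.lower().split()]
    without_word = [d for d in docs if word not in d.lower().split()]
    return [with_word, without_word]


def build_hierarchy(docs: list[str], split: Splitter, max_size: int,
                    depth: int = 0, max_depth: int = 10) -> Cluster:
    """Split recursively until a cluster is small enough to browse."""
    node = Cluster(docs)
    if len(docs) <= max_size or depth >= max_depth:
        return node
    groups = [g for g in split(docs) if g]
    if len(groups) < 2:  # the splitter could not separate the documents
        return node
    node.children = [
        build_hierarchy(g, split, max_size, depth + 1, max_depth)
        for g in groups
    ]
    return node


def leaves(node: Cluster) -> Iterator[Cluster]:
    """Yield the clusters at the bottom of the hierarchy."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)
```

Passing the splitter in as a parameter keeps the recursion independent of the model, so the same loop works whether the groups come from LDA or from another clustering method. The `max_depth` guard is there in case a splitter keeps returning groups that never shrink.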

Technologies used: Python, concurrent.futures, Joblib, asyncio, LDA topic modeling, Ollama Python client, JSON handling (orjson).
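
For the asynchronous naming step, one common pattern is to cap the number of requests in flight with a semaphore so that a locally running model is not overloaded. The sketch below illustrates that pattern only. `name_clusters`, `fake_namer`, and the `max_concurrent` parameter are hypothetical, and the stand-in namer replaces the real call to the Ollama client, whose prompt and model are not given in this post.

```python
import asyncio
from typing import Awaitable, Callable

# A namer receives a cluster's keywords and returns a readable label.
Namer = Callable[[list[str]], Awaitable[str]]


async def fake_namer(keywords: list[str]) -> str:
    """Hypothetical stand-in for an LLM call: joins the first two keywords."""
    await asyncio.sleep(0)  # yield control, as a network request would
    return " & ".join(k.title() for k in keywords[:2])


async def name_clusters(clusters: dict[str, list[str]], namer: Namer,
                        max_concurrent: int = 4) -> dict[str, str]:
    """Name all clusters concurrently, with at most `max_concurrent`
    requests running at the same time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def name_one(cluster_id: str, keywords: list[str]) -> tuple[str, str]:
        async with semaphore:
            return cluster_id, await namer(keywords)

    pairs = await asyncio.gather(
        *(name_one(cid, kws) for cid, kws in clusters.items())
    )
    return dict(pairs)


if __name__ == "__main__":
    example = {"0": ["ocean", "energy", "wave"], "0.1": ["protein"]}
    print(asyncio.run(name_clusters(example, fake_namer)))
```

Because the namer is injected, the concurrency logic can be tested without a running model, and a real LLM-backed coroutine can be swapped in later.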

Frontend: Interactive visualization

To make the clustered data user-friendly, I designed a custom interactive visualization using React and D3.js:

  • Hierarchical diagram:

    • The left side displays the parent topic, while the right side shows its subtopics.
    • Users can click on subtopics to dive deeper into the hierarchy or click the parent to move up a layer.
    • At the deepest layer, users see a list of publications with key citation information and links to detail views.
  • Technical implementation:

    • React for UI rendering and state management
    • D3.js for a dynamic, interactive diagram
    • Next.js for seamless routing/navigation
    • CSS for styling and responsiveness
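
The post does not describe the data format handed to the diagram. One plausible shape, assumed here, is a nested `name`/`children` tree, which `d3.hierarchy` reads by default. The sketch below shows how the backend could serialize clustered topics into that shape. `Topic`, `to_tree`, and the `publications` key are hypothetical, and the standard `json` module is used in place of orjson to keep the example dependency-free.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    publications: list[str] = field(default_factory=list)
    subtopics: list["Topic"] = field(default_factory=list)


def to_tree(topic: Topic) -> dict:
    """Convert a topic into a nested dict. Inner nodes carry `children`;
    the deepest layer carries its list of publications instead."""
    node: dict = {"name": topic.name}
    if topic.subtopics:
        node["children"] = [to_tree(t) for t in topic.subtopics]
    else:
        node["publications"] = topic.publications
    return node


if __name__ == "__main__":
    root = Topic("Marine science", subtopics=[
        Topic("Wave energy", publications=["Paper A", "Paper B"]),
        Topic("Ocean plastics", publications=["Paper C"]),
    ])
    print(json.dumps(to_tree(root), indent=2))
```

With this layout, clicking a subtopic corresponds to descending into one entry of `children`, and the publication list appears only once a node has no further subtopics.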

Image: The interactive D3.js diagram, which lets users explore hierarchical topic clusters.

Image: The visualization's interactivity. Subtopics can be explored, and the tooltip shows the hierarchy of topic clusters.

Key features

Feature | My role | Impact
Hierarchical clustering | Designed and implemented the NLP pipeline | Made thousands of publications navigable
LLM topic naming | Integrated Ollama for automated naming | Improved usability with human-readable labels
Interactive visualization | Built the D3.js/React component | Enabled intuitive exploration of complex data
Reusable design | Adapted WaveLinks’ Figma design | Accelerated development and ensured consistency
Optimized pipeline | Implemented multiprocessing and async | Handled large datasets efficiently

The outcome

CONNECT BY YERUN is now a live platform that:

  • Recommends collaborations and funding based on user profiles
  • Simplifies data exploration through hierarchical clustering and interactive visualizations
  • Integrates multiple data sources (ORCID, CORDIS) into a single, user-friendly interface
  • Empowers researchers to take control of their collaboration and funding pursuits

The platform is aimed primarily at researchers from YERUN’s 23 member universities, but any researcher with an ORCID iD can log in.

Explore CONNECT BY YERUN
