About

Education and Research

I received my diploma from the Department of Electrical and Computer Engineering, Democritus University of Thrace, Greece in 2017, and right after, I started pursuing a Ph.D. entitled “Data Science for Environmental Applications”. I successfully defended my Ph.D. in 2022.

During my Ph.D., I was a Teaching Assistant (Databases, Data Mining, Machine Learning). On top of that, I worked as a Research Associate, a position I have now held for more than 8 years. I have published 15 scientific papers, which have been cited more than 1,000 times. Throughout this time, I have been honing my skills in data aggregation, preprocessing, visualization, machine learning, feature selection, natural language processing, clustering techniques and generative AI.

My time at Mathisys Technologies

In 2019-2020 I completed my military service, and during that period I landed my first Data Science job at a Quantitative and Algorithmic Trading Research company.
During my time there, I contributed significantly to the company’s success by implementing Machine Learning from scratch, building Data Science Pipelines, and developing and evaluating Machine Learning models.

One of my key achievements was introducing Machine Learning to the company. I built everything from scratch, including algorithms, models, and tools. This significantly improved the company’s ability to analyze data and make informed decisions.
I also developed Data Science Pipelines that automated various procedures, including Data Aggregation, Pre-processing, Feature Engineering and Selection, Model Training and Evaluation, Analysis, and Simulation. These pipelines saved the team hours of manual work, allowing them to focus on other critical tasks.

I also analyzed data using Interactive Visualizations. I developed tools that the team used daily, and these tools helped to explain complex data in an easy-to-understand manner.
I developed and evaluated various Machine Learning models, including Linear Regression, Random Forest, Light Gradient Boosting, Logistic Regression, and K-means. I also explained the outcomes of Classification and Regression Machine Learning models, as well as the interactions between predictors and targets.
My experience and expertise eventually allowed me to supervise and mentor other Data Scientists, demonstrating my ability to collaborate and communicate effectively with colleagues.

My time at Mobileum

My journey at Mathisys Technologies lasted 2 years and 7 months. After this period, I landed my second Data Science job in 2022, where I focused on solving complex problems related to anomaly detection, time series forecasting, and performance optimization. My role involved analyzing large and intricate datasets to extract insights and drive actionable solutions for clients.

One of my primary responsibilities was applying Anomaly Detection techniques to identify irregularities and outliers in client data. By implementing Root Cause Analysis, I was able to pinpoint the underlying factors causing these anomalies, helping teams resolve issues efficiently and maintain system reliability. My contributions in this area enabled clients to gain a clearer understanding of their data quality and operational performance.

Another key area of focus was Time Series Forecasting. I researched and applied statistical methods and regression techniques to forecast Key Performance Indicators (KPIs) with a high degree of accuracy.
My work ensured that critical business metrics were predicted reliably, allowing clients to make informed, data-driven decisions.
Additionally, I explored and evaluated new project opportunities by performing deep exploratory data analysis, delivering comprehensive insights that validated project viability and business value.

Performance optimization was another challenge I took on. I enhanced the efficiency of existing production workflows by optimizing code, achieving a remarkable 53% improvement in execution speed. This optimization not only reduced resource consumption but also accelerated processing times, significantly improving overall productivity for the team.

Throughout my time in this role, I immersed myself in exploratory data analysis (EDA), developing a deep understanding of datasets and uncovering trends that were otherwise overlooked. I collaborated closely with cross-functional teams to communicate my findings and ensure that stakeholders could leverage these insights effectively.
This role allowed me to refine my skills in statistical modeling, anomaly detection, and time series analysis while reinforcing my ability to deliver impactful results in a fast-paced business environment.

By the end of my journey at the company, which lasted 1 year and 1 month, I had strengthened my technical expertise, improved production processes, and contributed to the company’s success through a combination of innovation, critical thinking, and a commitment to solving challenging problems.

My time at Agroknow

I began my next chapter as a Data Scientist in 2023, where I shifted my focus to solving challenges in forecasting, classification, and large-scale data analysis. My role was pivotal in driving innovation and delivering impactful solutions for real-world problems.

One of my primary achievements was improving the performance of time series forecasting models. I introduced advanced pre-processing, feature engineering, and feature selection techniques, which significantly enhanced the pipeline’s accuracy and scalability.
By optimizing and tuning the forecasting process, I successfully improved the performance of over 50,000 models by 87%, a result that brought considerable value to the team and stakeholders.

In addition to forecasting, I played a central role in the development of a Large Language Model (LLM) tailored to address client-specific queries. This model, which I built and fine-tuned, was designed to assist with answering questions about critical incidents and data points.
My work involved integrating natural language processing techniques to ensure the model could deliver clear, accurate, and insightful responses, ultimately improving the user experience and client satisfaction.

I also tackled highly imbalanced datasets by developing and implementing robust classification models. Leveraging techniques like resampling, feature engineering, and model tuning, I delivered solutions that were both accurate and reliable. These models were instrumental in extracting meaningful insights and enabling more informed decision-making for the team.

Throughout my time in this role, I regularly delivered product demos and facilitated webinars to showcase the capabilities of our models and tools to high-value clients. I enjoyed the opportunity to bridge the gap between technical work and stakeholder communication, ensuring that the value of our solutions was understood and appreciated.

This experience allowed me to further refine my skills in time series forecasting, natural language processing, and classification modeling while staying at the forefront of modern tools and methodologies. I worked on scalable, real-world solutions that had a measurable impact, solidifying my role as a results-driven and innovative Data Scientist.

My time at NTT Data

In 2025, I joined NTT Data as a Senior AI Engineer, where I focused on building advanced AI solutions for large-scale skill and occupation classification. My role centered on designing, deploying, and optimizing complex NLP pipelines to support workforce upskilling and personalized career recommendations within the Europass platform and ESCO ecosystem.

One of my most significant contributions was designing and deploying two production-grade pipelines for skill and occupation extraction. These pipelines consisted of 3- and 5-model workflows, including components like text splitting, semantic skill mapping, classification, and cross-encoder ranking. By optimizing the models, I improved inference speed and reduced infrastructure costs, while scaling the system to process 1M+ profiles daily5% accuracy.

A key highlight of my work was the development of a Skill Gap Analysis engine, which compares a user’s current skills with the requirements of a target occupation. This solution identifies missing skills and computes a readiness score, empowering users to understand their career progression and bridge skill gaps effectively. By integrating this system into Europass, we increased the accuracy of career recommendations, directly impacting thousands of users across Europe.
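To make the idea concrete, here is a minimal sketch of a skill gap comparison. It is not the production engine: it assumes exact matching on normalized skill names (rather than semantic mapping) and defines the readiness score as the share of required skills the user already has. The names `skill_gap` and `GapReport` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapReport:
    """Illustrative result type: what is missing and how ready the user is."""
    missing_skills: list[str]
    readiness: float  # assumed definition: covered required skills / all required skills


def skill_gap(user_skills: list[str], required_skills: list[str]) -> GapReport:
    # Normalize so that "Python" and " python " count as the same skill.
    have = {s.strip().lower() for s in user_skills}
    need = {s.strip().lower() for s in required_skills}

    # An occupation with no listed requirements is treated as fully covered.
    if not need:
        return GapReport(missing_skills=[], readiness=1.0)

    missing = sorted(need - have)
    readiness = 1 - len(missing) / len(need)
    return GapReport(missing_skills=missing, readiness=readiness)


report = skill_gap(
    user_skills=["Python", "SQL", "Statistics"],
    required_skills=["python", "sql", "machine learning", "statistics"],
)
print(report.missing_skills, report.readiness)
```

In this toy example the user covers three of the four required skills, so the only missing skill is "machine learning". A real system would need fuzzy or embedding-based matching, since the same skill is rarely written the same way twice.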

To ensure adaptability for diverse client needs, I built configurable APIs and pipeline tooling that allowed clients to customize model selection and thresholds. This flexibility reduced integration time by 50% and streamlined deployment across multiple environments. I also implemented a model patching mechanism, enabling seamless updates to production systems with zero downtime.

Beyond core pipeline development, I contributed to the automation of quarterly data processing and reporting, which involved handling 300GB+ data dumps. My improvements reduced analytics turnaround time by 95%, accelerating stakeholder decision-making and improving operational efficiency.

This role at NTT Data allowed me to combine technical innovation with measurable business impact. It strengthened my expertise in transformer-based NLP, scalable system design, and data-driven personalization, while giving me the opportunity to deliver solutions that directly improved career development tools at a European scale.

I love data. I am passionate about Data Science. I learn every day in this exciting, ever-changing field. As I learn, I try to write blog posts to document my journey on Medium and my personal website. The code for most of my projects is available on GitHub, where my Data Science Portfolio is also located.

At the beginning of my Data Science journey, I competed on Kaggle. After all, how can you learn if you don’t get your hands dirty?

You can contact me on LinkedIn.

In my free time I read, play board games, paint miniatures, create dioramas, attend concerts, watch TV shows and movies, and play computer games and the piano. I also travel as much as possible.

Dimitris Effrosynidis

Hard Skills