Beyond Hype: The Practical Evolution of Data Science with Generative AI

Abstract

Data science has always embraced innovation, but the emergence of generative AI presents a truly transformative shift. This talk goes beyond the hype to explore how generative AI is changing the day-to-day work of data scientists.

Summary

Transcript

This transcript was autogenerated. To make changes, submit a PR.

My name is Omer and I'm joined by my colleague Narmeen. We're thrilled to be here at Conf42 LLMs discussing one of the most exciting and transformative topics in our field: generative AI. Data science has always embraced innovation, but the emergence of generative AI presents a truly transformative shift. Before we dive in, we're going to quickly introduce ourselves. I'm a platform engineer at IBM within the Client Engineering team, working mainly on automation pilots and helping clients achieve their transformation goals. Narmeen?
Hi everyone, I'm Narmeen. I'm a data scientist in IBM Client Engineering, the same team as Omer, and my day-to-day involves trying to understand clients' problems and solve them with data and AI, and now with generative AI.
Thank you, Narmeen. Before we dive in, I'm going to quickly walk you through what we'll be covering today. I'll first walk through the evolving landscape of data science. I'll then pass on to Narmeen, who will share her practical experience with gen AI and the impact of gen AI on data scientists. I'll then look at the future of data science with gen AI and finally share some actionable next steps at the end.
So let's dive in. Generative AI didn't appear overnight. As you can see from the chart on the right, its market size has shot up over the past few years and will continue to do so over the next decade. Several key technological breakthroughs have enabled its rapid rise in recent years: the introduction of the transformer architecture, access to massive data sets, and advances in computational power, with hardware like NVIDIA's GPUs and TPUs making large-scale AI training more feasible. But how exactly is generative AI different from the traditional AI that we've seen over the past few years? Traditional AI focuses on analyzing existing data sets, while generative AI focuses on creating new content like images, text, and even music and code.
You're all accustomed to this with models like GPT-4.5, which can now even create synthetic data sets that preserve statistical properties while also protecting privacy. This is something that Narmeen will touch upon with our experiences later on. However, to fully grasp the scale of this transformation, let's explore how AI workflows have shifted from prediction to generation.
For years, AI and data science primarily focused on prediction-based models: classifying data, making forecasts, and optimizing processes based on historical patterns. With generative AI, we're moving beyond just prediction to creation, as I mentioned before: synthetic data, images, text, all that good stuff. This shift introduces a new way of thinking about AI workflows. Let's break it down.
On the left side, we can see the traditional data science lifecycle before this shift. It involves business understanding, defining the problem to be solved; data mining and cleaning, gathering and pre-processing the data; feature engineering and exploration, understanding the patterns and relationships in the data; predictive modeling, training traditional machine learning models to make forecasts; and finally data visualization, where you present these insights to stakeholders and clients, whoever they might be. While this workflow is well established, it's really centered around structured data and predictive analysis.
Now let's compare this with the right side, which shows the generative AI workflow. Generative AI introduces a different approach focused on creation. As we mentioned before, we still have business understanding, which is crucial; we still need that understanding of the data. But now we frame these problems in terms of generation, not just prediction.
Then there's prompt engineering, the new feature engineering, let's say: crafting prompts effectively to guide these AI models. We also have to give the model relevant context, so instead of traditional labeled data sets, gen AI relies on context embeddings and training data relevance. We also have a fine-tuning element, where we fine-tune these AI models to give outputs suited to the domains we highlighted. And finally there's evaluation: new ways of assessing AI-generated content, ensuring coherence, correctness, and ethical considerations. To summarize, AI workflows are evolving from predictive analysis to generative capabilities, and understanding this shift is the key to leveraging AI's full potential in real-world applications. More on that later.
However, with every breakthrough comes concern. Let's address some of the biggest challenges data scientists face in this new AI-driven world, and even in some of these workflows. This slide really addresses the elephant in the room: many data scientists are genuinely worried about how gen AI might impact their careers. As we shared on the slide before, it has really automated much of this process, and some might fear that it is doing essentially everything that we're doing. But that's not the point. Some of the concerns data scientists have put across are these. These models can analyze complex data sets and produce insights in minutes that might have taken days previously. We're even seeing cases where companies are using these models to replace custom machine learning models that data scientists used to work on for certain tasks, like classification or sentiment analysis. And, from the slide earlier once again, routine tasks like data cleaning and analysis, and even feature engineering, are increasingly becoming automated. While some fear AI could replace data scientists, the reality is quite different.
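The prompt engineering and context steps described above can be sketched roughly as follows. This is a minimal illustration, not the workflow the speakers used: the `PromptSpec` class, the `build_prompt` function, and the section labels are all hypothetical names introduced here, and no particular model or API is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces of a prompt."""
    task: str                                             # what the model should generate
    context: list[str] = field(default_factory=list)      # relevant background snippets
    constraints: list[str] = field(default_factory=list)  # output rules, e.g. length or tone


def build_prompt(spec: PromptSpec) -> str:
    """Assemble the task, context, and constraints into one prompt string."""
    sections = [f"Task: {spec.task}"]
    if spec.context:
        sections.append("Context:\n" + "\n".join(f"- {c}" for c in spec.context))
    if spec.constraints:
        sections.append("Constraints:\n" + "\n".join(f"- {c}" for c in spec.constraints))
    # Blank lines between sections keep the prompt easy to read and to edit.
    return "\n\n".join(sections)


if __name__ == "__main__":
    spec = PromptSpec(
        task="Summarize the attached document for a non-expert reader.",
        context=["The reader is new to the domain."],
        constraints=["Use plain language.", "Keep it under 150 words."],
    )
    print(build_prompt(spec))
```

Keeping the task, context, and constraints separate makes it easier to iterate on one part of the prompt at a time, which is much of what prompt engineering involves in practice.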
I'm going to pass on to Narmeen, who will showcase how generative AI is proving to be an incredibly powerful partner rather than a competitor. Over to you.
Thank you, Omer. So there are a lot of concerns related to generative AI, and rightly so. Generative AI is a powerful technology that can do a lot of the tasks we used to do. It can fill in missing values in data, find anomalies in data, and even label data. But think about it: it can do all of these mundane tasks, leaving us data scientists time to focus on actual model development. Even more, it can help us accelerate model development; for example, by suggesting hyperparameters, it can help us reach optimization faster. And let's be honest, how many times have we been in a scenario where we did not have enough data or the right data? Now that we have generative AI that is able to generate data, we can bring ideas to life sooner. So I'm here to say that data scientists and generative AI are a powerful dynamic duo, and I'm going to go through some of the projects where data scientists have used generative AI to accelerate the project build and open up opportunities for our clients. Let's dive in.
Our first project example was with a finance client. The client had a marketing team who tested different marketing strategies with different focus groups. The problem was that finding the right people for a focus group was difficult. Furthermore, the marketing team had to test every strategy multiple times with different focus groups, and getting those people was hard. Overall, this was a very time-consuming and resource-intensive process. So how did we help them? Well, we used generative AI to create customer personas. On the right side, you can see one example: Sandy, who is 28 years old, a single mother, and a teacher assistant. This persona was generated by generative AI. And how did we achieve that?
First, we had a team of data scientists who identified an appropriate model for this problem; we used the IBM Granite model. Then the team gathered real customer data and fed it to the model. The generative AI was able to analyze the trends in the behavior of the customer data and create the different personas. Lastly, the data team validated the personas to make sure they represented, or were close to, the real customers. Now that the client has a model able to generate personas, they can simulate multiple focus groups. They can try out their bold and innovative strategies with these simulated focus groups, which opens up a lot of opportunities for them.
The second example is something we all come across commonly: not having enough data. For this, we had a client who dealt with heavy machinery on site, and they wanted to focus on the hoisting operation. They had large lifts that loaded and unloaded heavy materials, and they were concerned that these lifts were not being used efficiently and that this was costing them. They were also concerned that these lifts were used by personnel, and they wanted to ensure the safety of those personnel. They wanted our help with that. So how did we do it? First, they gave us some sensor data and images of the site. When we analyzed the images, we realized we didn't have enough data, so we augmented the images using IBM Maximo Visual Inspection. Maximo Visual Inspection is a platform with computer vision and deep learning capabilities that allows you to label data and train and deploy models. We had a data team who augmented these images; of course, they chose which kinds of augmentation were appropriate for this project, and then we were able to create new images. Now that we had the newly created images and the sensor data, we were able to simulate scenarios and identify future risks, and thus put a threat detection mechanism in place.
With these scenarios, we were also able to identify the most optimized way of using the lifts, thus increasing efficiency and reducing cost for our client.
The next example is not related to one project; it's something I do for any project. In IBM, we have to work with different domains. One day we are working with retail, one day with finance, and one day with telecommunications, and getting my head around a domain can be very overwhelming. I remember a particular project for a telecommunications company. They gave us cell tower data and wanted to predict future cell tower congestion. They also gave us a large document and a mathematical formula describing how they currently calculate congestion manually. Understanding that document was very difficult for me. It had so many terms I had never heard of. What does QoS mean? So here I used generative AI to simplify the complex concepts. I also used it to summarize the mathematical formula, make it more understandable for me, and even do an initial analysis of the data. Because of that, I got my head around the domain quite fast. That said, generative AI did give me a basic level of guidance, but I still had to use my expertise to make decisions. For example, when I asked it which KPIs I should use for cell tower congestion, it suggested KPIs like successful handshakes, but that wasn't relevant to the data we had received. So I still had to use my expertise to make the right decisions for the project.
The next project, and this one is very close to me, was with a company that dealt with contracts. They wanted to use generative AI to generate clauses for the contracts. The company had templates for a contract; for example, a template would have a start date, end date, termination clause, culpability, and so forth. Based on the client, they would alter the contract.
For example, they might have a client that does not need an indemnity clause, so they wouldn't need to add it to the contract. They wanted to use generative AI to make these changes. So how did we do it? Again, we had a data team who provided the generative AI with the templates of the contracts and wrote the appropriate prompts. What we observed was that the generative AI was too wordy; it was adding extra information that wasn't needed. Here again we needed data scientists, who kept improving the results and iterated throughout the project to get to the final output.
So we definitely do need data scientists, don't we? Generative AI is an incredibly powerful and transformative technology, revolutionizing the way we work. But we still need human oversight and expertise to get the most out of generative AI, and we need human creativity and insight to make appropriate decisions for a project. Overall, generative AI is here to elevate us, not replace us. And now I will pass it on to Omer, who will talk about the emerging trends in generative AI.
Thank you, Narmeen. Now that we've touched upon some of the experiences, impact, and lessons with generative AI as data scientists, it's time to see what the future holds and what the emerging trends are. One we're seeing quite often recently is the use of multimodal models for specific use cases. For instance, watsonx, our own data and AI platform, is able to host many AI systems that can analyze both text and images. One example was a project where we had to use medical reports to create more comprehensive diagnoses, and we were able to pick out the right multimodal model for that use case. Another trend is reinforcement learning in gen AI.
This is where reinforcement learning acts as a feedback mechanism that allows the generative model to continuously improve its output by learning from rewards, essentially through trial and error. We're seeing this, for example, in models like DeepSeek and ChatGPT, which use reinforcement learning to improve responses over time, offering more comprehensive and contextually relevant answers to the user. And finally, the most important one, I think, is open-source AI. The increased availability of, general accessibility of, and collaboration on these models is what will create the gap and allow people to really make use of the future of generative AI. Really, the future of generative AI is going to be more interactive, adaptable, and multimodal, allowing new possibilities and applications across the variety of industries that we've covered.
Now that we've looked ahead, the real question is: what should you do to stay ahead? As a call to action, I'd say collaborate with AI; don't work against it or compete with it, but rather use it. Invest in your skills, whether in research or development or wherever it may be: invest in your AI skills, and use AI to invest in your skills. As gen AI continues to evolve, it's clear that the future is full of possibilities. How will you embrace this change? With that, thank you very much, and we hope to see you again.
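The image augmentation step from the hoisting project earlier in the talk can be illustrated with a small sketch. This is not the IBM Maximo Visual Inspection API and not the team's actual pipeline; it is a plain-Python illustration under stated assumptions: images are grayscale grids of pixel values from 0 to 255, and the function names, the three transforms, and the brightness range are all hypothetical choices made for this example.

```python
import random

# Assumption: an image is a row-major grid of grayscale pixels in 0..255.
Image = list[list[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def adjust_brightness(img: Image, delta: int) -> Image:
    """Shift every pixel by delta, clamping to the valid 0..255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]


def augment(img: Image, n: int, seed: int = 0) -> list[Image]:
    """Create n randomly transformed copies of one source image."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    variants = []
    for _ in range(n):
        new = img
        if rng.random() < 0.5:
            new = flip_horizontal(new)
        if rng.random() < 0.5:
            new = rotate_90_clockwise(new)
        # The brightness range is an arbitrary illustrative choice.
        new = adjust_brightness(new, rng.randint(-30, 30))
        variants.append(new)
    return variants


if __name__ == "__main__":
    source = [[10, 200], [90, 255]]
    for variant in augment(source, n=3):
        print(variant)
```

As the talk notes, which augmentations are appropriate depends on the project: a flip or rotation that is harmless for one kind of image can produce unrealistic scenes for another, so the choice of transforms stays with the data team.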

Slides

Download slides (PDF)

Conf42 Large Language Models (LLMs) 2025 - Online

March 20 2025 - premiere 5PM GMT

Omer Ali Omer

Technology Degree Apprentice @ IBM

Syeda Narmeen

Data Scientist @ IBM
