The technical challenges of AI in product development

Here are 3 technical challenges and how tech teams can address them.

Kasey talks about the intersection of product and AI at a CSDojo mini-conference to 70+ attendees.

As the product manager leading Planview’s AI Assistant, Planview Copilot, as well as other AI/ML projects at Microsoft’s Bing Platform, I’ve worked through an abundance of challenges in AI product development, both internal and external.

With more entities than ever adopting the power of generative AI (GenAI), it’s become clear that the brewing hype will culminate in hundreds of new use cases and the complete overthrow of traditional methods. We’re talking about crazy stuff: using AI to manage schedules, generate music, and transcribe videos into meeting notes.

AI cat meme generator: https://www.perfectcorp.com/consumer/blog/generative-AI/ai-cat-meme

Working within this space is exciting; it’s not only a lucrative place to be, but many traditionally non-technical folks are jumping into the industry, bridging that technical gap and allowing for greater accessibility. Tools like GitHub Copilot, Claude, and ChatGPT have made learning software development easier. More importantly, they’ve enabled millions of non-technical entrepreneurs to build ideas and concepts into real prototypes, ideas that otherwise would’ve lived in our heads forever without action. It’s a remarkable sight to behold.

That being said, this article isn’t about hand-waving the fluff of GenAI. Rather, I’m setting the stage for several challenges that come up when developing AI-enabled products in the first place. Here, I’ll focus primarily on LLM-enabled tools, which, as mentioned, have exploded thanks to how easy they are to conceptualize.

1. AI security and privacy concerns 🪪

As more and more organizations continue their transformation strategies around adopting AI, more concerns about how AI processes data rise to the top. It’s not as if these issues were never there; companies and organizations alike have always invested heavily in the data protection and privacy policies that keep their businesses healthy.

Traditional examples include the oh-so-popular SOC 1 and SOC 2 compliance frameworks, as well as the many cybersecurity rules that govern who can access and make sense of proprietary data. Startups and large tech companies alike have been victims of hacks, cybersecurity attacks, data breaches, and general leaks of sensitive information worth millions if not billions of dollars 💲.

When we nonchalantly insert AI into the chaos without a clear understanding of the risks involved, we’re throwing ourselves into the fire with the nearest extinguisher being miles away.

First, there’s the LLM training question: how at risk is a given organization’s sensitive data if it’s processed by an LLM such as the GPT-4 API, Claude Sonnet, or Google Gemini Pro? Will the LLMs be trained on this data? Often, the answer is no: the data is processed by the models but not used for training. Make sure your messaging on this is clear.
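
To make that concrete, here’s a minimal sketch of what “processed but not trained on” looks like from the builder’s side. It assumes the OpenAI Python SDK, an illustrative model name, and a deliberately naive, hypothetical redaction step; whether the provider retains or trains on the data is ultimately a policy question to confirm in their terms, not something the code itself controls.

```python
# Minimal sketch: send proprietary text to an LLM API for processing.
# The redaction helper is a naive, hypothetical example of minimizing
# what leaves your infrastructure in the first place.
import re

from openai import OpenAI

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_sensitive_fields(text: str) -> str:
    # Illustrative only: mask email addresses before the text leaves your systems.
    return EMAIL.sub("[REDACTED_EMAIL]", text)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_ticket(ticket_text: str) -> str:
    cleaned = redact_sensitive_fields(ticket_text)
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model name for the example
        messages=[
            {"role": "system", "content": "Summarize this support ticket."},
            {"role": "user", "content": cleaned},
        ],
    )
    return response.choices[0].message.content
```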

Second, hallucinations could introduce thousands of worried faces. Even after turning a model’s temperature down, no one can guarantee that an LLM will return accurate answers 100% of the time; that’s simply its non-deterministic nature in effect. However, as more teams understand this, builders can communicate more clearly that customers should still validate generated responses themselves, even as LLMs become more powerful and hallucinate less.
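
As a rough illustration, here’s a sketch of “turn the temperature down, but still validate.” It assumes the OpenAI Python SDK; the model name and the ground-truth check are made up for the example, and the point is simply that the caller, not the LLM, owns the final accuracy check.

```python
# Minimal sketch: low temperature reduces randomness but does not guarantee
# accuracy, so the application still validates the answer against its own
# ground truth before showing it to a customer.
from openai import OpenAI

client = OpenAI()

KNOWN_PLAN_NAMES = {"Free", "Team", "Enterprise"}  # hypothetical ground truth

def answer_pricing_question(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",   # assumed model name for the example
        temperature=0,   # less randomness, not a correctness guarantee
        messages=[{"role": "user", "content": question}],
    )
    answer = response.choices[0].message.content

    # Cheap post-hoc check: only surface answers that reference plans we actually sell.
    if not any(name in answer for name in KNOWN_PLAN_NAMES):
        return "I'm not confident in that answer; please check the pricing page."
    return answer
```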

Third, we have the data ecosystem piece: how will your customers know, or simply trust, that your LLM-based product processes their data without mixing it up with data belonging to your other customers, and that you’re storing user-submitted prompts and AI-generated responses in a safe and secure location?
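
One way to earn that trust is to make tenant isolation a first-class part of the design. Below is a minimal, purely illustrative sketch of storing prompts and responses partitioned by tenant; a real system would back this with a database, per-tenant encryption keys, and an audit trail.

```python
# Minimal sketch: prompts and responses are partitioned by tenant at write
# time, so one customer's history can never be read through another's query.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PromptRecord:
    tenant_id: str
    prompt: str
    response: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class TenantScopedStore:
    def __init__(self) -> None:
        self._records: dict[str, list[PromptRecord]] = {}

    def save(self, record: PromptRecord) -> None:
        self._records.setdefault(record.tenant_id, []).append(record)

    def history(self, tenant_id: str) -> list[PromptRecord]:
        # Reads are always scoped to a single tenant_id.
        return list(self._records.get(tenant_id, []))
```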

AI adoption is rapid, but ironically, builders can at times run too fast without giving their customers the right running shoes to keep up. I’ve observed that while AI development and enablement have “gotten the zoomies,” customer AI adoption remains relatively slower once security enters the picture. I recommend all teams and parties remain diligent in their AI strategy and give their customers a fully transparent picture showing that they’re following all security guidelines. Adoption may be slow, but at least it moves, and that’s what truly matters.

And I promise you: it’ll be worth it, because AI is just that game-changing.

2. Disrupting traditional “user flows” 🧑‍💻

Common sense tells us that introducing new products to a given market is hard. It just is. Finding the right audience and product-market fit can be insufferable. Even with GenAI blowing minds out of the water and being treated as a “shiny new toy” by most organizations, product validation and finding product-market fit could still encounter friction.

Building a new AI product to solve a certain use case may sound easy on paper, but in practice, there are many intricacies to replacing a step within a broader user flow with AI. Here’s a clear example:

If I’m a software developer trying to figure out how to run a build for my new web application (for example, on Netlify) and connect it to a custom domain, where does AI fit into the picture? Imagine a team decides to use AI to guide new users through setting up and running their builds. Would a user ask the AI tool questions and wait 20–30 seconds for each answer to load, or would they rather just play around with the tool themselves and use trial and error until their build runs?

The truth is, it’s hard to insert an AI tool into many traditional user flows that already run on “autopilot.” Many users’ brains have already been wired to complete certain tasks in their day-to-day workflow in a specific way, without any need for AI to help them. In fact, the initial reception to being asked to adopt AI may even be backlash or rejection.

So we’ve defined the problem; is there a solution to all of this? How do we break down that “automated user flow” barrier and ensure an AI tool can properly integrate itself as part of a user’s way of doing things? Will we drive more overall engagement with the AI tool because of that?

Every organization’s solution will obviously be different, depending on the problem they’re trying to solve. Trying to use AI to build a recipe app? Work with real home cooks to validate how much their traditional process of developing recipes would be disrupted. Building a copilot for indie game developers? Work with them directly and talk to them. Try to be a game developer yourself so you know the pain inside and out, and where you feel GenAI can change the game (pun intended). Otherwise, it’s difficult to simply insert AI in an attempt to disrupt their traditional processes.

My suggestion is to continue iterating and failing fast. We’re all taking our best guesses, but with more product testing and user validation, we’ll trend in the right direction.

3. Multiple GenAI agents and different execution times 🤖

Many GenAI products take the shape of a “chatbot” that lets users enter a text-based (or sometimes image or video) prompt and generates various forms of output. We’ve reached a point where AI-generated images of the Big Dipper could be mistaken for the real thing, and where brands leverage AI for videos that look identical to real-life film. Break it down by the pixel, why don’t you 😵‍💫?

That being said, in general, many of these products are still like any basic computing system. They involve:

  1. An input
  2. An output

But the magic that happens between 1 and 2 is what differentiates products in their respective markets. There could be multiple LLMs at play, talking to each other and executing consecutively or concurrently to generate the output. Likewise, the total execution time is a function of how much data you feed into the AI system and how many LLMs are involved in generating a response. There’s a clear correlation.
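
Here’s a small sketch of that trade-off. The model calls are faked with a fixed delay (no real API is involved), but it shows why dependent steps stack up latency when they run consecutively, while independent steps can run concurrently.

```python
# Minimal sketch: two fake "model calls" run back-to-back vs. at the same time.
import asyncio
import time

async def call_model(name: str, prompt: str) -> str:
    await asyncio.sleep(2)  # pretend each model call takes ~2 seconds
    return f"{name} answer for: {prompt}"

async def consecutive(prompt: str) -> list[str]:
    # Step 2 depends on step 1's output, so the calls run one after another (~4s).
    first = await call_model("retriever", prompt)
    second = await call_model("writer", first)
    return [first, second]

async def concurrent(prompt: str) -> list[str]:
    # Independent sub-tasks can run at the same time (~2s total).
    return await asyncio.gather(
        call_model("summarizer", prompt),
        call_model("classifier", prompt),
    )

async def main() -> None:
    for label, flow in [("consecutive", consecutive), ("concurrent", concurrent)]:
        start = time.perf_counter()
        await flow("What did customers say about onboarding last week?")
        print(f"{label}: {time.perf_counter() - start:.1f}s")

asyncio.run(main())
```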

Based on this knowledge, doesn’t that mean that if we want fast execution times, the amount of data and the number of use cases have to be limited? Well, it really depends on multiple factors, including the LLM being used, the PTUs (provisioned throughput units) provisioned to serve it, and the amount of data being queried or inserted into the context window. Nevertheless, many organizations are starting to adopt the “agent model,” and for good reason.

As more GenAI-based chatbots and copilots expand their sets of use cases for various scenarios, more teams are leveraging the agent model to scale that expansion. This article by Lari Hämäläinen from McKinsey explains it well: AI agents are specialized software components powered by LLMs, designed to plan and execute specific and/or multi-step tasks. They are typically deployed on their own to address a specific use case, or as part of a broader system of multiple agents. As mentioned above, these tasks can be planned and executed in parallel or in consecutive steps. Here’s an example:

Let’s say we’re building Microsoft Copilot, and users want to ask it to:

  1. Browse a specific website full of puppy descriptions.
  2. Generate an image of a cute puppy based on those descriptions.

That’s a two-step consecutive process: first, the LLM (assuming it has access to relevant data) has to perform RAG (retrieval-augmented generation) to pull the most relevant data from the source (the website of puppy descriptions). That in and of itself would involve one “agent” executing the task. Second, it would need to generate the image based on those descriptions, a step that requires a different model (an AI image generator like Midjourney).
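
A sketch of that flow might look like the following. Both agent functions are stand-ins (the retrieval agent would wrap your actual RAG stack, and the image agent would wrap whichever image model you use); the point is simply that step two consumes step one’s output, so the agents must run consecutively.

```python
# Minimal sketch of the two-agent pipeline described above. Both functions are
# hypothetical stand-ins that return placeholder data so the flow is runnable.

def retrieve_descriptions(source_url: str, query: str) -> list[str]:
    # Agent 1 (stand-in): a real version would scrape/index source_url and use
    # retrieval-augmented generation to return the most relevant descriptions.
    return [
        "a golden retriever puppy with floppy ears",
        "a sleepy corgi puppy curled up in a blanket",
    ]

def generate_image(prompt: str) -> str:
    # Agent 2 (stand-in): a real version would call a text-to-image model and
    # return the image; here we just return the prompt we would have sent.
    return f"[image generated from prompt: {prompt}]"

def puppy_pipeline(source_url: str) -> str:
    # Consecutive execution: the image prompt depends on the retrieved text.
    descriptions = retrieve_descriptions(source_url, query="cute puppies")
    image_prompt = "A cute puppy matching: " + "; ".join(descriptions)
    return generate_image(image_prompt)

print(puppy_pipeline("https://example.com/puppy-descriptions"))
```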

Now, again, lots of this stuff is totally doable with many free or cheap tools on today’s market. Services like AWS Bedrock and Amazon SageMaker are making it easier than ever for even non-technical entrepreneurs to test out different agents on different use cases.

While scaling complexity and use cases is critical for wider AI adoption, it can also lengthen execution times for the services themselves. Depending on how complex the use case is, users could be sitting there for a few minutes, waiting for the AI to finish executing.

Build for high-value, low-complexity use cases first, and chase the high-value, high-complexity ones second. Yes, I’m telling you to pick the low-hanging fruit 🍎.

Conclusion

Notice how, despite the challenges, I’m optimistic about the potential AI can bring to businesses and consumers alike. While the wave of excitement overwhelms some of us, it’s critical to understand that certain “killer use cases” will stand above the rest and outweigh the costs and technical challenges of development. Obviously, nailing those use cases is part of the problem, but it’s a problem worth pursuing.

Happy building!

About Me

My name is Kasey Fu. I’m passionate about writing, technology, AI, gaming, and storytelling 😁.

Follow this publication for more technology and product articles! Check out my website and my Linktree, and add me on LinkedIn or Twitter, telling me you saw my articles!