Creating an AI Assistant Using OpenAI

Everyone has heard of ChatGPT, Copilot, Bard, and Gemini, and many of us now use basic AI in our daily work. However, it is also possible to use the Large Language Model (LLM) behind something like OpenAI to create a bespoke AI assistant for your customers.

Let’s say we operate a large-scale, global eCommerce website and want to create an AI assistant for our customers. We want the assistant to answer technical questions about our products, provide information on order status and give customer-specific pricing information. Fortunately, we’ve already built such a system, called Ziggy. So how does it work?

Creating an AI Assistant

Retrieval-Augmented Generation

First of all, you need to know what Retrieval-Augmented Generation is!

Retrieval-Augmented Generation (RAG) is a process that optimises the output of a large language model by referencing an authoritative knowledge base outside of its training data sources before generating a response.

LLMs are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. However, they can sometimes present false or out-of-date information or create responses from non-authoritative sources.

RAG addresses these issues by redirecting the LLM to retrieve relevant information from authoritative, pre-determined knowledge sources. This approach gives organisations greater control over the generated text output, and users gain insights into how the LLM generates the response.

RAG is also an AI framework for improving the quality of LLM-generated responses by grounding the model on external sources of knowledge to supplement the LLM’s internal representation of information. It ensures that the model has access to the most current, reliable facts and that users have access to its sources, ensuring that its claims can be checked for accuracy and ultimately trusted.

In summary, RAG is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts. It’s like giving the model an “open-book” instead of a “closed-book” exam. Another way of thinking about it is to say that our assistant will only be able to answer questions where we give it the answer beforehand!
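The “open-book exam” idea above can be sketched in a few lines of code. This is a deliberately naive illustration, not how Ziggy is implemented: the `retrieve` function uses simple word overlap instead of a real vector search, and the knowledge base contents are made up for the example.

```python
# A minimal sketch of the RAG flow described above. retrieve() and the
# knowledge_base contents are illustrative placeholders only.

def retrieve(question, knowledge_base, top_n=2):
    """Naive keyword retrieval: rank documents by word overlap with the question."""
    words = set(question.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_n]

def build_prompt(question, knowledge_base):
    """Ground the model by prepending the retrieved facts to the user's question."""
    context = "\n".join(retrieve(question, knowledge_base))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

knowledge_base = [
    "The AD15 series regulates pressure up to 150 psig.",
    "The B13 filter widget is obsolete.",
    "Our returns policy allows 30 days.",
]
prompt = build_prompt("What replaces the obsolete B13 filter widget?", knowledge_base)
```

A real system would replace `retrieve` with a semantic (vector) search, but the shape is the same: find authoritative text first, then hand it to the LLM alongside the question.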

An Example of Retrieval Augmented Generation

Let’s clarify this with an example. Here is a complete conversation with our AI assistant, Ziggy:

Ziggy

How can I help you today?

User

Hi, I have been using an Acme b13 filter widget on a blasting machine but now need to update whole unit to something that's not obsolete. Can you recommend current part and spec to equal b13 performance? Filtration not of concern. Must be able to pass 40 cu ft of air per min and reg up to 150psi. Acrylic bowl ok. 1/2" BSP tappings.

Ziggy

Based on your requirements, a suitable current product could be the AD15 series, specifically AD15-4AK-AD1-RMG.
This is an Acme Plus filter widget from the AD15 Acme product range that fits the specifications you have provided. It's designed for use with Compressed Air, and has the relieving feature. The body is a zinc alloy and the bonnet is aluminium. Its adjustment is through a knob made from acetal resin. The filter element is a sintered plastic.
This model has a flow rate of 225 scfm (106 dm³/s) which exceeds your requirement of 40 cubic feet per minute. It also has a 1/2" PTF Port size that can be adapted to 1/2" BSP with an additional fitting. Operating pressure is 150 psig (10 bar) adequate for your 150 psi need. The unit utilises an automatic drain and the bowl is durable metal which is more robust than acrylic, but if you specifically require an acrylic bowl, you may have to look into another series or customized solution.
Please consider consulting with an Acme product expert or technical support to ensure that this is the best suitable part for your needs, given the information shared here is of general nature.

In the example above, we use the basic OpenAI LLM to handle the natural language conversation. However, our assistant appears to have some detailed, specific technical knowledge not in the original OpenAI LLM. So, how did we create an AI virtual assistant to make this work?

There are two parts to the answer:

  • Prompt Engineering
  • Data Preparation

Prompt Engineering

The secret to creating an AI assistant is to insert “hidden” additional prompts that the user cannot see. In this case, from Ziggy’s point of view, this is the information that we gave it:

System

You are an assistant to people with questions about Acme products and orders. Your answers should always try to use the additional information provided in your knowledge context. You can answer questions about orders and will be provided with additional context about orders to do so. You don't need to express sentiment towards the status of an order. You should never invent order information.
You should refrain from mentioning that you are an AI system or that you were developed by OpenAI. When answering product questions, you should only discuss those manufactured by Acme.

User

More Information: The product with the id AD15-4AK-AD1-RMG is a Olympian Plus filter/regulator, 1/2 PTF, 0.3-10 bar, manual drain, 5μm filter element, with gauge from the B64G Acme product range. The AD15-4AK-AD1-RMG product Medium is Compressed Air. The AD15-4AK-AD1-RMG product Gauge port is 1/8\" PTF. The AD15-4AK-AD1-RMG product Relieving/Non-Relieving is Relieving. The AD15-4AK-AD1-RMG product Body is Zinc alloy. The AD15-4AK-AD1-RMG product Bonnet is Aluminium. The AD15-4AK-AD1-RMG product Gauge is External gauge. The AD15-4AK-AD1-RMG product Adjustment is Knob. The AD15-4AK-AD1-RMG product Adjusting knob is Acetal resin. The AD15-4AK-AD1-RMG product Elastomers is Polyurethane. The AD15-4AK-AD1-RMG product Bowl type is 0.2 litre with liquid level indicator. The AD15-4AK-AD1-RMG product Filter element is Sintered plastic. The AD15-4AK-AD1-RMG product Drain Type is Manual. The AD15-4AK-AD1-RMG product Port Size is 1/2\" PTF. The AD15-4AK-AD1-RMG product IMI Brand is Acme. The AD15-4AK-AD1-RMG product Product style is Plug-in. The AD15-4AK-AD1-RMG product Product Family is Air Preparation. A feature of the AD15-4AK-AD1-RMG product is High efficiency water removal. A feature of the AD15-4AK-AD1-RMG product is Good regulation characteristics. A feature of the AD15-4AK-AD1-RMG product is Non-rising adjusting knob has snap-action lock. The AD15-4AK-AD1-RMG product large image is https://cdn.Acme.com/images/243x/PH_AL_B64G_6GK_MD3_RMN_w.jpg.
The AD15-4AK-AD1-RMG product thumbnail image is https://cdn.Acme.com/images/53x/PH_AL_B64G_6GK_MD3_RMN_ws.jpg. The AD15-4AK-AD1-RMG product technical specs is https://cdn.Acme.com/pdf/en_8_240_300_B64G.pdf. The AD15-4AK-AD1-RMG product installation and maintenance instructions pdf is https://cdn.Acme.com/pdf/IM_en_8_240_300_B64G.pdf. The AD15-4AK-AD1-RMG product has the accessory Olympian Filter/Regulator Service Kits which is a Service Kit with the id B64G-KITM05R.
[ … other prompts … ]

User

Hi, I have been using an Acme b13 filter widget on a blasting machine but now need to update whole unit to something that's not obsolete. Can you recommend current part and spec to equal b13 performance? Filtration not of concern. Must be able to pass 40 cu ft of air per min and reg up to 150psi. Acrylic bowl ok. 1/2" BSP tappings.

Ziggy

Based on your requirements, a suitable current product could be the AD15 series, specifically AD15-4AK-AD1-RMG.
This is an Acme Plus filter widget from the AD15 Acme product range that fits the specifications you have provided. It's designed for use with Compressed Air, and has the relieving feature. The body is a zinc alloy and the bonnet is aluminium. Its adjustment is through a knob made from acetal resin. The filter element is a sintered plastic.
This model has a flow rate of 225 scfm (106 dm³/s) which exceeds your requirement of 40 cubic feet per minute. It also has a 1/2" PTF Port size that can be adapted to 1/2" BSP with an additional fitting. Operating pressure is 150 psig (10 bar) adequate for your 150 psi need. The unit utilises an automatic drain and the bowl is durable metal which is more robust than acrylic, but if you specifically require an acrylic bowl, you may have to look into another series or customized solution.
Please consider consulting with an Acme product expert or technical support to ensure that this is the best suitable part for your needs, given the information shared here is of general nature.

(Note: the text [ … other prompts … ] is a placeholder for other generated prompts.)

Notice the difference between this “behind-the-scenes” conversation and the conversation that the user sees.

  • First of all, this conversation has a “System” role. The text at the start explains to OpenAI how we want it to behave. It defines the persona of the AI assistant.
  • Secondly, a large prompt filled with product technical data has also been included. In reality, we include several such prompts with information about various products and product ranges, but we have omitted most of them for brevity. It is these hidden prompts that are the magic in creating an AI assistant!

These hidden prompts contain enough information for OpenAI to construct a response to the user’s real question. OpenAI does the hard work of understanding the user’s natural language request and composing a natural language response.
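The assembly of this behind-the-scenes conversation might look something like the sketch below. The system text and product facts are abbreviated from the transcript above, and `assemble_conversation` is an illustrative helper, not Ziggy’s actual code; the message format, however, is the standard one used by the OpenAI Chat Completions API.

```python
# A sketch of how the "behind-the-scenes" conversation might be assembled.
# assemble_conversation() is a hypothetical helper; hidden_prompts would come
# from your own data sources.

def assemble_conversation(system_text, hidden_prompts, user_question):
    """Build the full message list: persona, hidden context, then the real question."""
    messages = [{"role": "system", "content": system_text}]
    # Each hidden prompt is injected as a user message the end user never sees.
    for fact in hidden_prompts:
        messages.append({"role": "user", "content": "More Information: " + fact})
    messages.append({"role": "user", "content": user_question})
    return messages

messages = assemble_conversation(
    "You are an assistant to people with questions about Acme products and orders.",
    ['The product AD15-4AK-AD1-RMG is a filter/regulator with 1/2" PTF ports.'],
    "Can you recommend a current replacement for the obsolete b13 filter widget?",
)

# The assembled list would then be submitted to OpenAI, for example:
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(model="gpt-4o", messages=messages)
```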

But where did these additional, hidden prompts come from?

Data Preparation to Create an AI Virtual Assistant

Now we come to the second part of our answer to the question, “How did we make this work?”. This is typically the most time-consuming part of creating an AI assistant – preparing the data so that you can create hidden prompts from it.

In this example AI assistant, we used the following data sources:

  • 6,000 product and range PDF files
  • Structured CSV data on 60,000 products
  • A real-time link to pricing information
  • A real-time link to order status information

The key to preparing data is to understand that LLM AIs do not like structured data! They much prefer human-readable text information.

Data preparation largely involves converting existing structured data into plain text. Consider the hidden prompt in this example. It was originally CSV data that was converted as part of a batch process into the exact plain text that you can see here. All of our sample data follows this process.

Every data source needs a slightly different conversion process, but the result is much the same.
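For the CSV data, the conversion can be as simple as templating each row into the “The &lt;id&gt; product &lt;field&gt; is &lt;value&gt;.” sentences seen in the hidden prompt above. The sketch below is a simplified stand-in for our batch process; the field names and sample row are illustrative.

```python
# A simplified sketch of converting structured CSV product data into the
# plain-text sentence form used in the hidden prompts. Fields are illustrative.
import csv
import io

CSV_DATA = """id,Body,Bonnet,Port Size
AD15-4AK-AD1-RMG,Zinc alloy,Aluminium,1/2 PTF
"""

def row_to_sentences(row):
    """Turn one product row into 'The <id> product <field> is <value>.' sentences."""
    row = dict(row)  # copy so the caller's dict is not mutated
    product_id = row.pop("id")
    return " ".join(
        f"The {product_id} product {field} is {value}."
        for field, value in row.items()
    )

for row in csv.DictReader(io.StringIO(CSV_DATA)):
    print(row_to_sentences(row))
```

The same template-driven idea applies to other structured sources; only the extraction step differs per source, which is why every data source needs its own conversion process.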

For the non-real-time data (PDFs and CSV data in this case), we used a Vector database (e.g. Azure AI Search) to store the plain text data once it had been converted. For the real-time data, we use a more complex heuristic approach, which I won’t detail in this blog article!

Hooking It All Up

So, now we have OpenAI, a prompt entered by the user and a Vector database (and other data sources) with our domain-specific technical information. How do we hook it all up to create an AI assistant?

The secret is to take the user’s prompt and use some or all of it to query your data sources. For example, a simplistic approach would be to take the user prompt and search the Vector database with it. You could take the first five results (the Vector database will return the relevant chunks of text) and insert them as hidden prompts.

Once you have assembled your entire conversation (OpenAI is stateless, so you must submit the whole conversation every time), submit it to OpenAI and send the response back to the user!
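Putting those two paragraphs together, the request loop might be sketched as follows. Here `search_vector_db` and `call_openai` are illustrative stand-ins for a real vector-database query (such as Azure AI Search) and the real OpenAI API call, passed in so the loop itself stays simple.

```python
# A sketch of the full request loop. search_vector_db() and call_openai() are
# illustrative placeholders for the real vector search and the real API call.

def handle_user_message(history, user_text, search_vector_db, call_openai):
    """Retrieve context for the new message, then resubmit the whole conversation."""
    # Query the vector database with the raw user prompt; keep the top five chunks.
    chunks = search_vector_db(user_text)[:5]
    hidden = [{"role": "user", "content": "More Information: " + c} for c in chunks]
    # OpenAI is stateless: every call must include the entire conversation so far.
    conversation = history + hidden + [{"role": "user", "content": user_text}]
    reply = call_openai(conversation)
    return conversation + [{"role": "assistant", "content": reply}]

# Example wiring with stub implementations:
new_history = handle_user_message(
    history=[{"role": "system", "content": "You are an Acme product assistant."}],
    user_text="What replaces the b13?",
    search_vector_db=lambda q: ["The AD15 series replaces older filter widgets."],
    call_openai=lambda msgs: "(model reply)",
)
```

The assistant’s reply is appended to the history so the next user message repeats the same cycle against the full conversation.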

Challenges in Creating an AI Assistant

There are challenges with this approach to creating an AI Assistant. Beyond the difficulty and expense of preparing your data for AI, the key challenges are technical limitations on “context” size and the sheer cost of conversations.

“Context” is what OpenAI calls the whole conversation. Remember that you need to submit the entire conversation every time you want to get a response from OpenAI. OpenAI charges based on “tokens”.

For argument’s sake, let’s say that a “token” is four characters. Our sample conversation has about 4,000 characters, which is about 1,000 tokens. This will cost between 1 and 4 pence (or cents), depending on which LLM you are using. Remember that in our example, I have excluded some of the prompts that we generated for brevity. You can see how the cost can soon add up!
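The arithmetic in that paragraph is easy to sketch. Note that the four-characters-per-token figure is a rough rule of thumb and the 1p–4p range is the article’s illustrative pricing, not a real OpenAI rate card.

```python
# Back-of-envelope conversation cost: roughly four characters per token,
# priced per 1,000 tokens. The rates are illustrative, not real pricing.

def estimate_cost_pence(conversation_chars, pence_per_1k_tokens):
    tokens = conversation_chars / 4          # rough rule of thumb
    return tokens / 1000 * pence_per_1k_tokens

print(estimate_cost_pence(4000, 1))  # ~1,000 tokens at 1p per 1k tokens -> 1.0
print(estimate_cost_pence(4000, 4))  # same conversation on a pricier model -> 4.0
```

Because the whole conversation (including every hidden prompt) is resubmitted on each turn, this cost is paid again and again as the conversation grows.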

Similarly, depending on your LLM, there is a maximum possible “context” size. If you breach this limit, you will get an error back instead of a response!
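One simple way to stay under the limit, sketched below, is to keep the system message and drop the oldest other messages until the conversation fits a token budget. This is an illustrative strategy using a rough four-characters-per-token estimate, not a description of how Ziggy manages its context.

```python
# An illustrative trimming strategy: keep the system message, drop the oldest
# non-system messages until the conversation fits within a token budget.

def trim_to_budget(messages, max_tokens):
    """Return a copy of messages trimmed to roughly max_tokens."""
    def total_tokens(msgs):
        # Rough estimate: about four characters per token.
        return sum(len(m["content"]) for m in msgs) // 4
    kept = list(messages)
    while total_tokens(kept) > max_tokens and len(kept) > 1:
        # Index 0 is the system message; index 1 is the oldest other message.
        kept.pop(1)
    return kept
```

More sophisticated strategies (summarising old turns, or ranking hidden prompts by relevance before dropping any) trade extra complexity for better answers, which is exactly the balancing act discussed next.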

To mitigate these challenges in creating an AI assistant, you need to create a prompt management strategy.

Creating a Prompt Management Strategy

This strategy is part of the art of AI. There are several competing concerns that need managing:

  1. The time and effort of data preparation
  2. The level of detail required in hidden prompts to generate good responses
  3. The cost of creating detailed hidden prompts

We have found striking the balance between these things to be a trial-and-error task. Every domain is different. Different domains have different types of technical questions. Every domain has differing quality of technical information. Perhaps creating an AI assistant will be a catalyst in your organisation for finally getting on top of product data!

To aid us in finding the balance, we established a baseline of user questions that had been asked of a human operator using an old chat program. We also needed technical product expertise to establish whether or not Ziggy was generating sensible replies to queries. From there, it was an iterative process of tweaking data and data formats, altering search parameters and reviewing the cost and quality of the responses.

Create Your AI Assistant With Us

At McKenna Consultants, we can create an AI assistant for your software or platform, providing a full service from preparing your data to building the front-end and back-end software behind your AI assistant. If you are interested in how this service can benefit you, please contact us today to start a conversation.

Additionally, check out our blog to learn about our other areas of expertise.

Nick McKenna
Since 2004, Nick McKenna, BSc, MBCS, has been the CEO of McKenna Consultants, a bespoke software development company based in North Yorkshire, specialising in Cloud development, mobile app development, progressive web app development, systems integration and Internet of Things development. Nick holds a First Class degree in Computer Science (BSc) and wrote his first computer program at the age of nine, on a BBC Micro Model B computer. For the last 21 years, Nick has been a professional computer programmer and software architect. Nick’s technical expertise includes .NET Core, C#, Microsoft Azure, ASP.NET, RESTful web services, eProcurement, Swift, iOS mobile development, Java, Android mobile development, C++, the Internet of Things and more. In addition, Nick is experienced in Agile coaching, training and consultancy, applying modern Agile management techniques to marketing and running McKenna Consultants, as well as the development of software for clients. Nick is a Certified Enterprise Coach (Scrum Alliance), SAFe Program Consultant (SAI), Certified LeSS Practitioner (LeSS) and Certified Scrum@Scale Practitioner. Outside the office, Nick is a professional scuba diver and holds the rank of Black Belt 5th Dan in Karate.
