Structuring Your Knowledge Base - Detailed

To get the most out of your conversational AI, it's important to structure your knowledge base properly. This page covers common mistakes and best practices.

When you first start using the service, you'll receive a conversational AI with general world knowledge. To make it useful for your business, you'll need to add your specific knowledge to it.

Follow these best practices for optimal performance when building your text knowledge base in murai.

  1. Use clear wording, correct grammar, and avoid typos.
  2. Pay special attention to potentially confusing paragraphs.
  3. Include links within text and explain what they contain.
  4. Provide general information first, followed by details.
  5. Turn multiline lists into comma-separated ones.
  6. Use just plain text.
  7. Provide just answers; question and answer pairs are unnecessary.
  8. Provide context for each paragraph.
  9. Build a tree-like structure.

What is a Knowledge Base?

A knowledge base is a repository of useful data for your business. It is an essential tool for businesses that require quick and easy access to relevant information. You can build it from various sources such as notes, internal documentation, conversations, transcripts, articles, presentations, spreadsheets, and PDFs. When writing it, aim for paragraphs of up to 150 words each, which is about 8-10 sentences or around 1000 characters.
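
The size guidance can be checked automatically before text is added. The following is a minimal sketch, not part of murai: it assumes paragraphs are separated by blank lines, and the function name and constants are hypothetical.

```python
import re

MAX_WORDS = 150   # word limit suggested in this guide
MAX_CHARS = 1000  # character limit suggested in this guide


def find_long_paragraphs(text: str) -> list[tuple[int, int, int]]:
    """Return (paragraph_index, word_count, char_count) for each paragraph over a limit.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    too_long = []
    for index, paragraph in enumerate(paragraphs):
        words = len(paragraph.split())
        chars = len(paragraph)
        if words > MAX_WORDS or chars > MAX_CHARS:
            too_long.append((index, words, chars))
    return too_long
```

Any paragraph the function reports is a candidate for splitting into smaller logical sections.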

Copying the Raw Text

To start creating your knowledge base, you can copy all the text from your FAQ pages into a single document. However, there are several issues with using raw text directly in your knowledge base. To address these issues, you need to follow these best practices:

1. Use Clear Wording and Correct Grammar

When writing your knowledge base, it's important to use clear wording, correct grammar, and avoid typos. Unclear wording, incorrect grammar, and typos can all make it harder for the AI to understand the points conveyed in your knowledge base. Imperfect translation may change the original meaning and cause the AI to answer incorrectly in conversations. To ensure optimal performance, it is advisable to create your knowledge base in English, although you may also use other languages. It is also possible for a single knowledge base to support multiple languages.

Even a small typo or a stray punctuation mark can change the meaning of a sentence. For example, the sentence "I saw her duck" has a completely different meaning than "I saw her, Duck."

2. Avoid Confusing Paragraphs

Make sure your paragraphs are clear and logically organized. If the text is unclear or self-contradictory, the AI will not be able to use common sense to figure out what you meant. When in doubt, provide more explanation. For example, if the first sentence might be confusing, provide additional context to avoid misunderstandings.

Here is an example of a potentially confusing paragraph:

Make sure to follow the instructions for the installation of the software. If you run into any issues, contact customer support for assistance. The software is not designed to work on older operating systems.

This paragraph jumps from discussing installation instructions to contacting customer support to the software not working on older operating systems. To make it less confusing, it would be better to separate each topic into its own paragraph and provide additional context.

3. Include Links Within Text

If you want to include links, it's important to explain what they contain. Instead of just pasting the link into the knowledge base, we recommend reformatting it into a single paragraph that explicitly points to the link as an additional resource.

For example, you can incorporate the question into the paragraph and rephrase the second sentence to point to a link included at the end. This way, the AI will be able to point to the link.

Here is an example:

Bad:

Q: How do I reset my password?

A: To reset your password, go to our password reset page and follow the instructions.

Good:

If you want to reset your password, you can access the detailed instructions at https://www.example.com/reset-password.

4. Provide General Information First

Another good practice is to provide general information first and only go into details as a follow-up. Try providing information in logical order - from general to more detailed.

Here is an example of a paragraph that provides detailed information before general information:

To run the program, you must first install the dependencies. These include Python 3.6 or higher and some additional packages for data preprocessing. Once you have installed these dependencies, you can download the program files from our website and run the script to start the program.

This paragraph should be restructured to provide general information first, followed by details:

To run the program, you can download the files from our website and install the required dependencies. The dependencies include Python 3.6 or higher and some additional packages for data preprocessing. Once you have installed these dependencies, you can run the script to start the program.

5. Use Comma-Separated Lists

When formatting complex items for inclusion in the knowledge base, turn multiline lists into comma-separated ones. For example, if two bullet-point lists contain the steps to be completed on Firefox or Chrome browsers, respectively, put them in separate paragraphs and turn each multiline list into a comma-separated one. Since those are separate paragraphs, it is also important to add context to each of them.

Here is an example:

Bad:

Here is how to reset your password based on the browser you use:

Firefox: with 2FA, enter your email address, click "Submit," then follow the instructions in the email. You will be asked to enter a verification code sent to your registered phone number or email.

Chrome: enter your email on the login page and follow the instructions in the password reset link sent to your email.

Good:

To reset your password on Firefox browser with 2FA, enter your email address, click "Submit," then follow the instructions in the email. You will be asked to enter a verification code sent to your registered phone number or email.

To reset your password on Chrome browser, enter your email on the login page and follow the instructions in the password reset link sent to your email.
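
If your source material contains many bullet lists, the conversion can be scripted. The following is a minimal sketch under stated assumptions: the function name is hypothetical, list items are passed in as separate strings, and the context sentence is written by you.

```python
def list_to_sentence(context: str, items: list[str]) -> str:
    """Join list items into one comma-separated sentence that starts with its context."""
    cleaned = [
        # Drop leading bullet markers and trailing punctuation from each item.
        item.strip().lstrip("-*• ").rstrip(".;, ")
        for item in items
        if item.strip()
    ]
    return f"{context} {', '.join(cleaned)}."


steps = [
    "- enter your email on the login page",
    "- follow the instructions in the reset email.",
]
paragraph = list_to_sentence("To reset your password on Chrome browser:", steps)
```

The result is a single paragraph that carries its own context, so it still makes sense when read in isolation. Review the output by hand, since joined items may need rewording to read naturally.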

6. Use Just Plain Text

In order for artificial intelligence to properly comprehend the context of a given text, it is necessary to format it into coherent sentences. This formatting process usually involves converting any graphs, pictures, or tables into logically structured sentences that can be easily interpreted by an AI system. Additionally, it may also be necessary to provide additional contextual information or to rephrase certain parts of the text in order to make it more comprehensible. This can involve anything from adding supplementary information to the text to breaking it down into smaller, more easily digestible chunks that are easier for an AI system to process.

Here is an example of a table:

Food you can purchase from us:

Fruit  | Price in EUR | In Spanish
Apple  | 123          | Manzana
Orange | 215          | Naranja

We converted this table into sentences, one sentence for each row:

You can purchase Apples from us, which are a fruit that costs 123 EUR and is translated to “Manzana” in Spanish.

You can purchase Oranges from us, which are a fruit that costs 215 EUR and is translated to “Naranja” in Spanish.

This is a very simple table; in reality, tables often get a lot more complicated. To improve how the conversational AI understands very complex tables, it often helps to paraphrase what the table contains. In addition to the previous two paragraphs, you can also include the following text:

You can purchase from us Apples and Oranges.
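
For larger tables, writing one sentence per row by hand gets tedious. The following is a minimal sketch of generating the sentences from the rows of the table above. The field names, function names, and sentence template are assumptions made for illustration; adjust the template so that the output reads naturally for your data.

```python
# Rows of the example table, one dictionary per row (field names are hypothetical).
rows = [
    {"fruit": "Apple", "price_eur": 123, "spanish": "Manzana"},
    {"fruit": "Orange", "price_eur": 215, "spanish": "Naranja"},
]


def row_to_sentence(row: dict) -> str:
    """Turn one table row into a self-contained sentence."""
    fruit, price, spanish = row["fruit"], row["price_eur"], row["spanish"]
    return (
        f"The fruit {fruit} can be purchased from us, costs {price} EUR, "
        f'and is translated to "{spanish}" in Spanish.'
    )


def table_summary(rows: list[dict]) -> str:
    """Paraphrase what the whole table contains in one sentence."""
    names = " and ".join(row["fruit"] for row in rows)
    return f"You can purchase from us: {names}."


sentences = [row_to_sentence(row) for row in rows]
summary = table_summary(rows)
```

Each generated sentence repeats the column meaning (price, translation), so no row depends on a header that the AI might not see.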

7. Provide Just Answers

When including knowledge items containing questions and answers, it's best to provide just answers. One common problem is that a question may sit in one paragraph and only get answered in a different one. However, simply removing the question can leave a bare "Yes." as an answer that lacks proper context. That's why we carry the context of the deleted question into the answer by replacing "Yes." with a full sentence.

Here is an example:

Bad:

Q: Can I change my username?

A: Yes. It can be edited by going to personal profile and clicking the edit button.

Good:

You can change the username by going to personal profile and clicking the edit button.

8. Provide Context for Each Paragraph

When adding text to your knowledge base, especially if your source text consists of multiple connected paragraphs, make sure to provide the context in each paragraph.

Here is an example:

Machine learning is a subset of artificial intelligence that involves the construction of algorithms that can learn from and make predictions or decisions based on data.

One of the most common applications is in the field of image recognition. By training it on a set of images and their corresponding labels, it can learn to accurately classify new images that it has never seen before.

The context of the second paragraph is missing without the first paragraph. To resolve this issue, you can merge the paragraphs together, or if the text is too long, remove references to the first paragraph and provide more context to the second:

Machine learning is a subset of artificial intelligence that involves the construction of algorithms that can learn from and make predictions or decisions based on data.

One of the most common applications of machine learning is in the field of image recognition. By training a machine learning algorithm on a set of images and their corresponding labels, it can learn to accurately classify new images that it has never seen before.

9. Build a Tree-Like Structure

Most documents are already split into sections and form a tree-like structure that can be easily understood, but often the text we are adding to our knowledge base is longer than the 1000-character limit. We cannot simply split the text in half, since then the second paragraph would be missing its context. In this case it is best to split the text into multiple logical sections and create an additional paragraph that connects those sections.

Here is an example of how to deal with paragraphs longer than 1000 characters:

Standard equipment regarding safety of the Car 789 include: Anti-theft alarm system with interior monitoring in cab, Back up horn and towing protection, Central locking system with two remote control keys, system controls inside and SAFELOCK, Crosswind Assist, Driver Alert System, Driver and front passenger airbags with front passenger’s airbag deactivation button, eCall emergency system, Electric childproof lock, ESP, ABS, ASR, EDL, Hill Start Assist, Side and curtain airbags for driver and front passenger, First aid kit, warning triangle, 16-inch brake system, Cruise Control with speed limiter, Manual dimming breakaway interior rear view mirror, Separate daytime running lights (permanently switched on while driving), ISOFIX and top tether for rear seats, Heated washer nozzles, Single-tone horn, Start-stop system with regenerative braking, Three-point seat belts for all seats, height-adjustable with electric pre-tensioners on seats in cab, Seat belt warning for the driver's and front passenger's seat if unfastened, Washer fluid level indicator, Windscreen wiper intermittent control with four-speed control for windscreen wipers, Parking sensors, front Tyre pressure loss indicator and rear Tyre pressure loss indicator.

We split the text into 3 paragraphs. The first provides the context and connects the other two to the same topic. The other two paragraphs repeat the wording used in the connecting paragraph to further emphasize the connection between them.

Standard equipment regarding safety of the Car 789 can be split into 2 categories: main safety features and additional safety features.

Main safety features of Car 789 include: Anti-theft alarm system with interior monitoring in cab, Back up horn and towing protection, Central locking system with two remote control keys, system controls inside and SAFELOCK, Crosswind Assist, Driver Alert System, Driver and front passenger airbags with front passenger’s airbag deactivation button, eCall emergency system, Electric childproof lock, ESP, ABS, ASR, EDL, Hill Start Assist, Side and curtain airbags for driver and front passenger, First aid kit and warning triangle and 16-inch brake system.

Additional safety features of Car 789 include: Cruise Control with speed limiter, Manual dimming breakaway interior rear view mirror, Separate daytime running lights (permanently switched on while driving), ISOFIX and top tether for rear seats, Heated washer nozzles, Single-tone horn, Start-stop system with regenerative braking, Three-point seat belts for all seats, height-adjustable with electric pre-tensioners on seats in cab, Seat belt warning for the driver's and front passenger's seat if unfastened, Washer fluid level indicator, Windscreen wiper intermittent control with four-speed control for windscreen wipers, Parking sensors, front and rear Tyre pressure loss indicator.
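
When a long list has no obvious logical categories, a mechanical split can serve as a starting point. The following is a minimal sketch, not the method used for the example above: it packs items greedily into numbered parts that each stay within the character limit and adds a connecting paragraph. The function names and the wording of the generated paragraphs are assumptions. Grouping by meaning, as in the Car 789 example, usually gives better results and still needs human judgment.

```python
def render_part(topic: str, part: int, items: list[str]) -> str:
    """Build one paragraph that repeats the topic so it keeps its context."""
    return f"Part {part} of {topic} includes: {', '.join(items)}."


def split_into_paragraphs(topic: str, items: list[str], max_chars: int = 1000) -> list[str]:
    """Greedily pack items into paragraphs of at most max_chars characters.

    Returns the connecting paragraph first, followed by one paragraph per part.
    A single item that is too long on its own still gets its own paragraph.
    """
    groups: list[list[str]] = [[]]
    for item in items:
        candidate = groups[-1] + [item]
        # Start a new part if adding this item would exceed the limit.
        if groups[-1] and len(render_part(topic, len(groups), candidate)) > max_chars:
            groups.append([item])
        else:
            groups[-1] = candidate
    connecting = f"Information about {topic} is split into {len(groups)} parts."
    parts = [render_part(topic, n, group) for n, group in enumerate(groups, start=1)]
    return [connecting] + parts


paragraphs = split_into_paragraphs(
    "the safety features of Car 789", ["ESP", "ABS", "ASR", "EDL"], max_chars=60
)
```

The small `max_chars` value in the usage example only forces a split on a short list; with the default of 1000 the same four items fit into a single part.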

Contributors: Anze Mur