AI Chat Bot with voice commands using ChatGPT

Introduction

In this project, our goal was to build a system allowing users to search for information using voice commands. We decided to integrate ChatGPT, a large language model developed by OpenAI, to interpret and process the voice commands. In this blog post, we will walk through the process of setting up ChatGPT, using it for voice command search, and the results we achieved.

Demo Link

https://chatlink.vercel.app/

Setting up the frontend

We used basic HTML and vanilla JavaScript for the frontend.

The index.html file contains only a chat container for the results and a form with a text input, a record button, and a send button.

<body>
    <div id="app">
      <div id="chat_container"></div>
      <form>
        <textarea name="prompt" id="pmpt" rows="1" cols="1" placeholder="Ask Chatlink by typing or voice"></textarea>
        <button type="button" id="record"><img src="assets/record.png" alt="record" /></button>
        <button type="submit"><img src="assets/send.svg" alt="send" /></button>
      </form>
    </div>
    <script type="module" src="script.js"></script>
  </body>

The script file defines five functions: loader, typeText, chatStripe, record, and handleSubmit.

The loader function clears the element's text content and starts a timer that appends a period "." every 300 milliseconds, resetting the text to an empty string once it reaches four periods. This creates a cycling loading indicator. The timer can be stopped by calling the "clearInterval" function with the "loadInterval" variable as an argument.

let loadInterval
function loader(element) {
  element.textContent = ''
  loadInterval = setInterval(() => {
    element.textContent += '.'
    if (element.textContent === '....') {
      element.textContent = ''
    }
  }, 300)
}
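
The sequence of strings the loader displays can be sketched as a pure function, which makes the cycle easy to follow without a timer. The name nextLoaderText is an assumption made for illustration; it is not part of the original script.

```javascript
// Hypothetical helper: returns the text the loader shows after one more tick.
function nextLoaderText(current) {
  const next = current + '.'
  // Four periods reset the indicator, exactly as in loader()
  return next === '....' ? '' : next
}

// Simulate five ticks starting from an empty element
let text = ''
const frames = []
for (let tick = 0; tick < 5; tick++) {
  text = nextLoaderText(text)
  frames.push(text)
}
console.log(frames) // [ '.', '..', '...', '', '.' ]
```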

The typeText function creates a typing effect by appending the text to an element's innerHTML one character at a time, every 20 milliseconds. Once every character has been typed, the interval timer is cleared.

function typeText(element, text) {
  let i = 0
  const interval = setInterval(() => {
    if (i < text.length) {
      element.innerHTML += text.charAt(i)
      i++
    } else {
      clearInterval(interval)
    }
  }, 20)
}
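
What the element contains after each tick can be sketched as a list of growing prefixes. The helper name typingFrames is an assumption used for illustration only.

```javascript
// Hypothetical helper: the successive contents typeText would produce for a text.
function typingFrames(text) {
  const frames = []
  let shown = ''
  for (let i = 0; i < text.length; i++) {
    shown += text.charAt(i) // same per-character step as typeText
    frames.push(shown)
  }
  return frames
}

console.log(typingFrames('Hi!')) // [ 'H', 'Hi', 'Hi!' ]
```

At 20 milliseconds per character, a reply of n characters takes roughly n x 20 ms to finish typing.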

The chatStripe function returns the HTML for one chat message stripe. The "wrapper" element gets the extra class "ai" when the message comes from the AI. The "profile" element holds an image whose source is either the bot image or the user image, and the "message" element holds the message text and carries the unique identifier as its id attribute.

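// bot and user are the avatar image sources, defined elsewhere in the script.
function chatStripe(isAi, value, uniqueId) {
  // Use a ternary for the class: `isAi && 'ai'` would render the class "false" for user messages
  return `
        <div class="wrapper ${isAi ? 'ai' : ''}">
            <div class="chat">
                <div class="profile">
                    <img
                      src="${isAi ? bot : user}"
                      alt="${isAi ? 'bot' : 'user'}"
                    />
                </div>
                <div class="message" id="${uniqueId}">${value}</div>
            </div>
        </div>
    `
}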
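
chatStripe interpolates value directly into an HTML string, so a prompt containing markup would be parsed as HTML once the stripe is added through innerHTML. One common precaution is to escape the text first. The escapeHtml helper below is a hypothetical addition, not part of the original script.

```javascript
// Hypothetical helper: replaces characters that have special meaning in HTML.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  }
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch])
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'))
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

It could be applied at the call site, for example chatStripe(false, escapeHtml(data.get('prompt'))).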

The handleSubmit function is the event handler for the form's submit event. It builds a FormData object from the form, appends a chat stripe containing the user's prompt to the chat container, and resets the form. It then generates a unique identifier, appends an empty AI stripe carrying that identifier as its id, scrolls the chat container to the bottom, and starts the loader on the new message element. Finally, it sends the prompt as JSON in a POST request to the backend. When the response arrives, it stops the loader and either types out the bot's reply or, if the request failed, shows an error message.

const handleSubmit = async (e) => {
  e.preventDefault()
  const data = new FormData(form)
  chatContainer.innerHTML += chatStripe(false, data.get('prompt'))
  form.reset()
  const uniqueId = generateUniqueId()
  chatContainer.innerHTML += chatStripe(true, ' ', uniqueId)
  chatContainer.scrollTop = chatContainer.scrollHeight
  const messageDiv = document.getElementById(uniqueId)
  loader(messageDiv)
  const response = await fetch('https://chatlink.herokuapp.com/', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      prompt: data.get('prompt'),
    }),
  })
  clearInterval(loadInterval)
  messageDiv.innerHTML = ''
  if (response.ok) {
    // Named result to avoid shadowing the FormData object above
    const result = await response.json()
    const parsedData = result.bot.trim()
    typeText(messageDiv, parsedData)
  } else {
    const error = await response.text()
    messageDiv.innerHTML = 'something went wrong'
    alert(error)
  }
}
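
handleSubmit relies on a generateUniqueId helper that is not shown in this post. A minimal sketch, assuming an id only needs to be unique within the page, could combine a timestamp with a random hexadecimal string. The exact format below is an assumption.

```javascript
// Hypothetical implementation: the original generateUniqueId is not shown.
function generateUniqueId() {
  const timestamp = Date.now()
  // Math.random().toString(16) looks like "0.8f3a..."; slice(2) drops the "0."
  const randomHex = Math.random().toString(16).slice(2)
  return `id-${timestamp}-${randomHex}`
}

const firstId = generateUniqueId()
const secondId = generateUniqueId()
console.log(firstId !== secondId) // two calls are expected to differ
```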

The record function uses the Web Speech API to enable voice recognition. It generates a unique identifier, adds a placeholder chat stripe to the chat container, starts the loader on that stripe's message element, and starts a new speech recognition process. When recognition produces a result, the function copies the transcript into the form's text field, stops recognition, removes the placeholder stripe, and calls handleSubmit with the event object as an argument.

function record() {
  // Show a placeholder user stripe with the loader while listening
  const uniqueId = generateUniqueId()
  chatContainer.innerHTML += chatStripe(false, ' ', uniqueId)
  chatContainer.scrollTop = chatContainer.scrollHeight
  const messageDiv = document.getElementById(uniqueId)
  loader(messageDiv)
  const recognition = new webkitSpeechRecognition()
  recognition.continuous = false
  recognition.interimResults = false
  recognition.lang = 'en-US'
  // Register the result handler, then start listening
  recognition.onresult = function (e) {
    const transcript = e.results[0][0].transcript
    console.log(transcript)
    document.getElementById('pmpt').value = transcript
    recognition.stop()
    // Stop the placeholder's loader timer before removing the stripe
    clearInterval(loadInterval)
    // message -> chat -> wrapper: remove the whole placeholder stripe
    document.getElementById(uniqueId).parentElement.parentElement.remove()
    handleSubmit(e)
  }
  recognition.start()
}
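
record calls the prefixed webkitSpeechRecognition constructor directly, which throws a ReferenceError in browsers that do not expose it. A small feature check, sketched below, would let the page fall back to typing. The helper name getSpeechRecognition is an assumption; SpeechRecognition and webkitSpeechRecognition are the standard and prefixed names of the Web Speech API constructor.

```javascript
// Hypothetical helper: returns the speech recognition constructor, or null if unavailable.
// globalObject is passed in so the function can be tried outside a browser.
function getSpeechRecognition(globalObject) {
  return globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition || null
}

// In the browser this would be called as getSpeechRecognition(window)
class FakeRecognition {}
console.log(getSpeechRecognition({ webkitSpeechRecognition: FakeRecognition }) === FakeRecognition) // true
console.log(getSpeechRecognition({})) // null
```

If the helper returns null, the record button could be hidden or disabled.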

Backend and Integration with ChatGPT

First, we need an OpenAI account. Sign in to OpenAI, then click on the profile icon to get the API keys.
For the configuration code, go to the Playground on the OpenAI website and choose text-davinci-003, which works well for both text and code prompts. You can also choose another model based on your requirements. Then click on "View code" to get the code.
Next, create a basic Node server using Express and add a POST route like the one below.

app.post('/', async (req, res) => {
  try {
    const prompt = req.body.prompt
    const response = await openai.createCompletion({
      model: 'text-davinci-003',
      prompt: `${prompt}`,
      temperature: 0.3,
      max_tokens: 3000,
      top_p: 1,
      frequency_penalty: 0.5,
      presence_penalty: 0,
    })
    res.status(200).send({
      bot: response.data.choices[0].text,
    })
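  } catch (err) {
    // Log the full error on the server; avoid sending the raw error object,
    // which may expose request details to the client
    console.error(err)
    res.status(500).send({
      error: 'Something went wrong',
    })
  }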
})

This function is the route handler for HTTP POST requests to "/" in the Express.js app. It reads the "prompt" field from the request body and calls the OpenAI API through the "createCompletion" method, passing the model and sampling options as an object. On success, it responds with a status code of 200 and a JSON object whose "bot" field contains the generated text. If the call throws, it responds with a status code of 500 and an "error" field.
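
The handler passes req.body.prompt to the API without checking it, so an empty or missing prompt still triggers a request. A small validation step, sketched here with a hypothetical validatePrompt helper, is one way to reject such requests early.

```javascript
// Hypothetical helper: checks the request body before calling the OpenAI API.
function validatePrompt(body) {
  const prompt = body && typeof body.prompt === 'string' ? body.prompt.trim() : ''
  if (prompt === '') {
    return { ok: false, error: 'prompt must be a non-empty string' }
  }
  return { ok: true, prompt }
}

console.log(validatePrompt({ prompt: '  What is Node.js?  ' }))
// { ok: true, prompt: 'What is Node.js?' }
console.log(validatePrompt({}))
// { ok: false, error: 'prompt must be a non-empty string' }
```

In the route handler, a failed check could be answered with res.status(400) before any API call is made.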

Results and Conclusion

Overall, we were pleased with the results we achieved using ChatGPT in our voice command search system. In our use, it interpreted and responded to a wide range of voice commands accurately, which improved the user experience. In conclusion, ChatGPT proved to be an effective tool for interpreting and processing voice commands in this project, and we look forward to exploring the model's capabilities further. As a fun aside, the content of this blog post was itself generated using this chat bot.