Introducing btgpt – a Self-Voicing Chat GPT app for the BTSpeak – Beta1 now available

Summary

Information about btgpt (btg for short), a self-voicing Chat GPT app designed for the BTSpeak Raspberry Pi Linux device. It is built on an existing Chat GPT shell script and has been extensively enhanced for the BTSpeak by Tim Noonan. This post is for prospective users and beta testers.

Introducing btgpt!

Post Status: btgpt beta has provisionally been moved over to the official BTSpeak beta track. It is planned to arrive in an upcoming BTSpeak update for all users. These notes were originally written for public beta testers, but I am revising them for BTSpeak beta members. Forgive any transitional errors.

Welcome fellow BTSpeak users and anyone interested in the BTSpeak [1] from Blazie Technologies.

I’m very excited to tell you about an upcoming custom app I’m developing for the BTSpeak. It’s called btgpt (or btg for short). Btg is a self-voicing Chat GPT app designed and customised for the BTSpeak device. It runs in the Unix shell, that is, in traditional mode, so it works on both the basic and pro models.

Btg can be run from the Applications menu, the User Menu or the shell. A cut-down version called ask can also be accessed right from within the BTSpeak editor.

This is a very rough demo of using ask from within the Nano editor. Speech is too fast and I’m losing my voice.

When in the full btg app, you braille in your question (prompt) and then btg reads Chat GPT’s responses aloud, using the DECtalk synthesiser.

Here is a quick audio demo of a btg interaction. Apologies that the voice is a bit fast. New demos coming.

A future btg release that also works with ESpeak is being explored but, for now, btg is DECtalk only. I don’t want to release software that hasn’t had pretty thorough internal testing, but I also wanted to get the first release of btg into people’s hands as soon as possible.

Btgpt is released as open source software, as is the chatGPT-shell-cli script that it is built around.

This first release is only being made available to official BTSpeak beta testers. Thanks for your interest.

Based on an Existing Chat GPT script

BTGPT is built around the open source chatGPT-shell-cli bash script, developed by 0xacx.

“A simple, lightweight shell script to use OpenAI’s chatGPT and DALL-E from the terminal without installing python or node.js. The script uses the official ChatGPT model gpt-3.5-turbo with the OpenAI API endpoint”

  • Original chatGPT-shell-cli bash script
  • Original chatGPT-shell-cli Readme file

My sincere thanks to 0xacx. Without his extensive efforts in developing the original bash script, which handles all the GPT interaction code, this enhanced self-voicing BTSpeak app would not have been possible.

Please note that I have endeavoured to keep all the original functionality of the chatGPT-shell-cli script available. It provides a variety of command-line options for specific functionality and non-interactive modes. Only some of these are self-voicing.

However, I have also put in a lot of work to make btg more intuitive and straightforward to use and easier to configure, with the goal of supporting less technical users who want to benefit from Chat GPT.

So What Is BTG and How Does it Work?

btg allows you to braille in your questions (termed prompts) to the Chat GPT service and have the answers spoken aloud to you in a concise but structured format. It keeps track of your conversation history for the current btg session, so you don’t need to repeat yourself and Chat GPT can refer to your earlier information in its current response to you.

For its primary operation, btg is self-voicing. This means that it does not need you to review the screen to hear GPT responses or other critical information or errors. However, should you wish to, you can always enter Review mode on the BTSpeak, to see exactly what GPT output has been received and sent to the virtual screen. This could differ slightly from what you hear from the btg spoken output, because part of the self voicing capability involves reformatting information for better reading aloud.

Some more advanced features available in the original script do not currently self voice, but in these instances, review mode is always available.

Note: since the BTSpeak virtual tty screen has a limited number of lines, the BTSpeak review mode only shows text still present on the virtual screen. However, even if the GPT reply is longer than the number of lines on the virtual screen, it will be read out to you in full by the app.

Note: beta 1 includes an experimental feature that uses the BTSpeak ‘view file’ facility, so the entire response can be reviewed. To activate it, type v (for view) on a blank line and press enter.

How does BTG differ from the Original Script?

In addition to self-voicing Chat GPT output, the additions and enhancements I’ve made to 0xacx’s original Chat GPT script include:

  • added a check for Open AI API keys in the pi user’s home directory (/home/pi), compatible with the BTSpeak Chat GPT key location guidelines.
  • added a user configuration file so the script doesn’t need to be edited by the user.
    • This file is optional if the user is satisfied with the pre-set defaults.
    • Key=Value pairs set the preferred DECtalk voice name, rate, volume and pitch, as well as a contrast voice name and offset values for pitch, volume and rate, which are used for voice font reading of markdown elements such as headings and emphasis.
    • The user can specify a range of values used for GPT interactions, including the preferred GPT model, context size, initial Chat GPT prompts and more.
    • If not present, this configuration file is created by the btg app.
    • Configuration file parsing is robust: it ignores the case of keys and removes leading spaces and quoting from key value text. Valid ranges are checked for some key values.
  • added self-voicing for key errors, such as the key not being found or not appearing to be in the required format.
  • fine-tuned the wordings of the initial Chat GPT prompts for better BTSpeak aware responses.
  • Created a simple ask mode for asking a single question without using the full btg app.
    • so you can enter – ask who is Deane Blazie – and you hear the answer.
    • You can also call this ask command from within Nano (the BTSpeak editor) and it will speak the first paragraph of the answer and then insert the full response into a Nano file, for reading or later editing.
  • wrote a markdown self-voicing renderer (currently very DECtalk centric). It supports voicing the following markdown elements in a clear and concise form.
    • announce link name
    • announce image alt text or title
    • announce bullet points and sub-points
    • announce markdown to-do and done items
    • announce headings and their level, e.g., “H1: What is btg?”
    • differentiate block quotes, text in single quotes and text in back ticks (`) from regular text
    • DECtalk voice fonts for emphasis types including bold, italics, and bold with italics
  • added a ‘bang command’ feature (exclamation mark command), so a user can start a line with an exclamation mark immediately followed by a bash command, and it will be executed in a sub-shell with the results shown on screen. You are automatically returned to btg when the command finishes. For example, typing ‘!ls -1’ would list the file names in the current directory, one per line, and return you to btg.
  • also supports ‘!sh’ (exclamation mark s h) to start a bash sub-shell. Type ‘exit’ to return to btg.
  • for increased safety, added spelling out the command-line generated by the ‘command:’ GPT prefix command. When you prefix your prompt with ‘command:’ and ask a question about how to do something in linux, Chat GPT attempts to return a single command line to do the job. This is spelled out and you are asked if you wish to run the command or not. This is a feature offered in the original script, but I have worked to ensure the command line is spelt out fully, so you are clear what the proposed command line contains. This is an advanced feature for people with some shell experience!
    • Note: A future version of btg may disable this feature by default, but allow it to be activated via the configuration file.
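
To make the heading item in the list above more concrete, here is a minimal sketch of how a markdown heading line could be turned into a spoken-style announcement such as “H1: What is btg?”. It is an illustration only and is not taken from the btg source. The function name announce_heading is hypothetical, and the sketch handles headings only, not the other markdown elements.

```shell
# Hypothetical helper, not btg's actual code: announce a markdown heading
# as "H<level>: <text>" and pass every other line through unchanged.
announce_heading() {
  line=$1
  hashes=${line%%[!#]*}      # the leading run of # characters, if any
  rest=${line#"$hashes"}     # everything after that run
  level=${#hashes}
  case $rest in
    ' '*)
      if [ "$level" -ge 1 ] && [ "$level" -le 6 ]; then
        text=${rest#"${rest%%[! ]*}"}   # drop the spaces after the hashes
        printf 'H%d: %s\n' "$level" "$text"
        return 0
      fi
      ;;
  esac
  printf '%s\n' "$line"
}

announce_heading '# What is btg?'
announce_heading 'An ordinary line of text'
```

In a real renderer the announcement would then be sent to the speech synthesiser rather than printed.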

btg’s Origins and a bit About Me

I developed btgpt because I was really excited when I heard about the BTSpeak, with its accurate braille input and running on Linux. Even before I placed my order, I started researching Chat GPT options for the Linux shell and read about chatgpt-shell-cli. The first thing I did when my unit arrived was to install my key and try out the in-built Chat GPT app. Sadly, it wasn’t operational, so I downloaded chatgpt-shell-cli and started enhancing it to suit my needs and interests. What started out as a few tweaks soon led to the original 460 lines of bash code increasing to around 1000 lines (including liberal comments).

That process, combined with my frustration with Chat GPT Plus’s web interface, got me really thinking about the kinds of features I’d like in a Chat GPT app that is always at my fingertips. And so btgpt was born.

Forty years ago I taught myself Unix (the father of Linux) and regular expressions, and fell in love with the power and flexibility of the shell for directing and reformatting information. Unix, at its heart, just seemed purpose-designed for accessibility and transforming information into more accessible formats.

Then, in the very early 90s, my then partner Shane Alderton and I ran a community-based internet access service based on Linux, as part of the Australian Public Access Network Association (APANA). Many of our users were blind and accessed international email and swaths of online information via the service. Most of my system administration, configuration and use of the Linux system was via my trusty Braille ‘n Speak and its terminal mode.

So, 30 years on, Linux is actually inside a device even smaller than the original Braille ‘n Speak, and I am using the same kinds of tools I used way back then, developing ways to efficiently and effectively access information.

In the late 90s, when I worked for Royal Blind Society, I developed and ran Today’s News Now – a fully automated telephone service in New South Wales Australia providing access to the major daily newspapers and our talking book catalogue. This was somewhat like the News Line for the Blind service. It was built round DECtalk, and used rich Perl regular expression rules to reformat content to a spoken form, and to correct the endless DECtalk pronunciation problems. I became very familiar with its strengths – and all its foibles.

So, it was the fortuitous unfolding of all these factors, aligned with my background in voice user interface design, that led me on this journey to start exploring just what might be possible with this fledgling pocket-sized contemporary take on the venerable Braille ‘n Speak.

This is only the first btgpt beta; I have a lot of ideas to explore and features I think would be of value. It’s taken a lot of work to get btg to this point, but I think it is now pretty solid and that the code is in reasonably good shape for creative experiments and expansions down the track.

Please let me know how you find it and what things you might like me to consider for future versions.

Enjoy

Tim Noonan tim@timnoonan.com.au

Requirements to Use BTGPT

I’ve put a lot of work into btg already, but there is lots more work still to be done. Although the BTSpeak is generally based on open source software, maintaining device stability is crucial for all users.

For this reason we have decided to bring btgpt into the BTSpeak beta stream, so it can become a standard in-built BTSpeak app.

  • You need to have already obtained an Open AI key from Open AI, the folks who created and provide Chat GPT. This key can be a free key, or a funded key.
  • Create an account and get a free API Key here

  • You need to have already successfully installed your key in a key file on your BTSpeak, as per the instructions found in the OpenAI-key Help file which can be found in your BTSpeak help menu. Copying it to your public folder using file sharing is probably the easiest way to do this, then move it to the right place and name, using the BTSpeak file browser.

This Open AI key file should have the name .openAI_key

or

.openAPI_key

and be saved in your home directory, which is at

/home/pi

Note! The btg app will not and cannot operate unless a valid API key is found when it is run.

  • For now, understand that this app was primarily designed and tested with the DECtalk synthesiser. I make no guarantees that future versions will support ESpeak or other TTS engines. I am, however, actively investigating ways to make the app more TTS agnostic, to support more users and their preferences, especially users of non-English speech output.

Preparing this file on a computer, and saving it to the BTSpeak public folder through file sharing, is probably the main way to install this key file. You can then move and rename it using the BTSpeak File Browser.

Note About Free or Funded Keys

If you have just generated your free Open AI key for the first time, it may not be immediately usable, even though it is named and saved correctly. It may automatically start working at a later point. I suspect that when Open AI is under extreme load, it stops access to free keys until the load reduces. These instructions will be updated if we gain more information about this issue. It is not a bug in the btg app, but an Open AI response containing an error something like

“Your request to Open AI API failed: insufficient_quota. You exceeded your current quota, please check your plan and billing details. …”

We know that users who generate a new key and then fund their account, do get proper access.

Note, Chat GPT is supposed to work on the free tier, but one beta user encountered the problem I described above.

That said, I strongly encourage you to consider funding your Open AI account, either immediately or after you spend some time with Chat GPT. It is an incredibly affordable way to access the latest versions of Open AI’s services like Chat GPT, and is pay as you go. No monthly subscription, like Chat GPT Plus!

You need to fund a minimum of five dollars to your account in order to access the latest models. A sound approach is to fund $10 and top up when you are down to $5 remaining. This lets you keep track of your usage quite affordably.

The accuracy, intelligence and usefulness of GPT 4 and GPT 4 turbo over the free GPT 3.5 models is quite extraordinary.

Testing btg

Choose Chat GPT from the Applications menu.

If you are in the shell, you can just type btg at the command prompt.

You should either get a welcome message and beta version number or a spoken warning about problems with finding, or the format of, your OpenAI key.

If you get a start-up error about your key, you need to double-check it is in the right place, with the right name (note the upper case letters AI or API) and that it is in the right format. It should probably start with sk-
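
If you are comfortable in the shell, the checks described above can be sketched as follows. This is a rough illustration and not the code btg itself runs. The two file names and the sk- prefix come from this post; the function names find_key_file and key_looks_valid are hypothetical.

```shell
# Hypothetical sketch, not btg's actual code.
# Print the path of the first key file found in a directory.
find_key_file() {
  dir=${1:-$HOME}            # directory to search, defaulting to your home
  for name in .openAI_key .openAPI_key; do
    if [ -r "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}

# Succeed only if the first line of the file starts with sk-
key_looks_valid() {
  # Drop whitespace, such as a carriage return left by a Windows editor
  key=$(head -n 1 "$1" | tr -d '[:space:]')
  case $key in
    sk-?*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

For example, running `find_key_file /home/pi` prints the path of the key file if one of the two names is present, and `key_looks_valid` can then be given that path.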

Assuming btg Starts and Welcomes You

1: Type a question (termed prompt) and press enter. Wait a few seconds and (if all is well) you will hear your answer being read out to you.

2: Pressing enter on a blank line will immediately stop ongoing speech, including stopping the welcome message voice immediately after starting btg.

3: To leave the btg app, type q and press enter.
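
For the technically curious, the way a line of input is treated can be pictured as a simple dispatch. The sketch below is an assumption about the general shape and is not the actual btg code. The inputs (blank line, q, v, the exclamation mark commands and the ‘command:’ prefix) are those described in this post, while the function name classify_input and the labels it prints are hypothetical.

```shell
# Hypothetical sketch, not btg's actual code: decide what a typed line means.
classify_input() {
  case $1 in
    '')         echo stop-speech ;;       # enter on a blank line
    q)          echo quit ;;
    v)          echo view-response ;;     # experimental view feature
    '!sh')      echo sub-shell ;;
    '!'*)       echo bang-command ;;      # e.g. !ls -1
    command:*)  echo command-request ;;   # ask GPT for a command line
    *)          echo prompt ;;            # anything else goes to Chat GPT
  esac
}

classify_input '!ls -1'
classify_input 'who is Deane Blazie'
```

The order of the patterns matters: ‘!sh’ has to be tested before the general exclamation mark pattern, otherwise it would be treated as an ordinary bang command.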

Testing ask

Ask allows you to ask a single question of Chat GPT.

If you use the shell, you can do the following:

Let’s start with the usage message for ask, which you get by providing no arguments. This message is sent to the screen and is not self-voiced. Use Review to read it.

ask

Try using ask to get a question answered.

ask What is the meaning of life?

ask who is Deane Blazie?

Avoid questions with apostrophes for now.

If all is well, you should hear a GPT response to your question.

Using ask from within the BTSpeak editor

You are also able to use ask from within the editor.

Note: you can only do this from within an editable file (not the welcome screen, which is write-protected).

Press control-t, either by x-chord t or by pressing t-7-8 chord,

then type ask followed by your question, and press enter.

Basic ask example:

ask who invented Linux?

Quoted example:

ask "what is the Braille 'n Speak"

Note: the simple form of ask usage must not include the apostrophe character in your question. The apostrophe is a special quoting character in the shell. If in doubt, place your question in double quotes.
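
The effect of quoting can be tried safely in the shell without contacting Chat GPT. The sketch below uses a stand-in function, show_question, which is hypothetical and simply prints the question it receives in place of ask.

```shell
# Hypothetical stand-in for ask: print the question exactly as received.
show_question() {
  printf '%s\n' "$*"
}

# Plain words with no special characters can be typed as they are.
show_question who is Deane Blazie

# Double quotes protect the apostrophe, so it arrives as part of the question.
show_question "what is the Braille 'n Speak"

# Without the double quotes, the apostrophe would open a single-quoted string
# and the shell would keep waiting for a closing apostrophe.
```

Note that the quotes must be the plain straight double quote character, not curly quotes.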

If all goes well, you should hear the answer to your question, a short pause and then ‘new file is open’.

You are moved to the new file, which contains the text of Chat GPT’s answer to your question. If the answer was more than one paragraph in length, the whole answer is placed in the file, but only the first paragraph is spoken aloud.
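
As an illustration of the ‘first paragraph only’ behaviour, here is a minimal sketch that extracts the first paragraph from a multi-paragraph answer. It assumes paragraphs are separated by blank lines, which may not be exactly how ask decides where a paragraph ends. The function name first_paragraph is hypothetical.

```shell
# Hypothetical sketch, not the actual ask code.
# Read text on standard input and print only its first paragraph.
first_paragraph() {
  awk '
    NF == 0 { if (seen) exit; next }   # blank line: stop once text was seen
            { seen = 1; print }        # non-blank line: part of the paragraph
  '
}

printf 'Linux was created by Linus Torvalds.\nHe announced it in 1991.\n\nMore detail follows here.\n' | first_paragraph
```

Leading blank lines are skipped, so an answer that starts with an empty line still yields its first real paragraph.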

You can either close and exit this file, save it, or copy part or all of its contents, to use in another file.

To close and exit the file, press control-x followed by the letter n

You will be returned to your starting file.

Note: for complex reasons around speech output processing, you may notice that the ask command waits for a couple of seconds before returning. This is because the app makes a heuristic estimate of how long the speech will take to speak the answer. If we can improve on this down the track, we will.
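
The exact heuristic is not described here, so the following is only a guess at the kind of calculation involved: count the words to be spoken and divide by the speech rate in words per minute. The function name estimate_speech_seconds is hypothetical, and the default of 280 words per minute is the btg default rate mentioned in this post.

```shell
# Hypothetical sketch, not the actual ask code.
# Estimate how many seconds it takes to speak some text at a given rate.
estimate_speech_seconds() {
  text=$1
  rate=${2:-280}                                   # words per minute
  words=$(printf '%s\n' "$text" | wc -w | tr -d ' ')
  # seconds = words * 60 / rate, rounded up to a whole second
  echo $(( (words * 60 + rate - 1) / rate ))
}

estimate_speech_seconds "Linux was created by Linus Torvalds in 1991"
```

A real estimate would probably also need to allow for pauses at punctuation and for numbers that are spoken as several words.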

Explaining and Editing the .btgpt.conf File

Note: when in the file browser, you will need to show hidden files by pressing h in order to see this file and open it in the editor.

The .btgpt.conf file is stored in your .config directory.

This is a hidden folder containing various configuration files for your account and applications.

When you use the BTSpeak editor (or another editor) to edit the .btgpt.conf file, you will find explanatory comment lines starting with the hash symbol.

Then, you will find a variety of Key=Value pairs, similar to but not the same as a Windows .ini file.

As a general rule, you should not use quotes (") to surround your values to the right of the = equals sign.

You should not use spaces on either side of the = equals sign.

example line

dtVoiceName=p

I have specially coded the app to ignore the capitalisation of the key names in the config file, but they are generally in camelCase for the technically curious.

I have also added checks for data consistency and known value ranges where I can, but you should only edit what you need in the file. If something is unclear, it’s best to leave it blank or at defaults.

If you get odd behaviour and you think your config file is messed up, just rename it or delete it, and the next time you run btg a fresh version will be placed in your .config directory.

Note that if keys are found that are not recognised, they will be noted on screen. However, the app should still operate properly.
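
To show how forgiving parsing of this kind can work, here is a minimal sketch that handles one line of a Key=Value file. It is not the parser used in btg. The key names are the DECtalk settings named in this post (the real file contains more keys), and the function name parse_conf_line and the wording of its messages are hypothetical.

```shell
# Hypothetical sketch, not btg's actual parser.
# The lower-cased key names this sketch recognises.
KNOWN_KEYS="dtvoicename dtrate dtvolume dtpitch dtoffsetrate dtoffsetvolume dtoffsetpitch dtcontrastvoice"

parse_conf_line() {
  # Remove leading spaces and tabs
  line=$(printf '%s\n' "$1" | sed 's/^[[:space:]]*//')
  case $line in
    ''|'#'*) return 0 ;;                          # blank line or comment
    *=*)     ;;                                   # looks like Key=Value
    *)       printf 'ignored line: %s\n' "$line"; return 0 ;;
  esac
  key=$(printf '%s' "${line%%=*}" | tr '[:upper:]' '[:lower:]')
  value=${line#*=}
  # Strip one pair of surrounding double or single quotes, if present
  case $value in
    \"*\") value=${value#\"}; value=${value%\"} ;;
    \'*\') value=${value#\'}; value=${value%\'} ;;
  esac
  case " $KNOWN_KEYS " in
    *" $key "*) printf '%s=%s\n' "$key" "$value" ;;
    *)          printf 'unrecognised key: %s\n' "$key" ;;
  esac
}

parse_conf_line 'DTVoiceName=p'
parse_conf_line '  dtRate="280"'
```

Lower-casing the key before comparing it is what makes the capitalisation in the file irrelevant, and an unrecognised key is reported rather than treated as a fatal error.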

Default Speech Settings

Whether the .btgpt.conf file is present or not, system defaults are built into the app. Many of the settings in the configuration file are left blank, or are identical to what is in the app itself. You can edit or enter values as needed.

Configuration File Speech Settings

In this first release of BTG, the app speaks directly to the DECtalk via the say command. For this reason, it has its own voice settings separate from your BTSpeak speech settings set via s-chord.

The current btg defaults for the DECtalk are as follows:

  • dtVoiceName=P – for Paul. The first letter of each name is sufficient in the config file.
  • dtRate=280 – words per minute.
  • dtVolume=60 – percent
  • dtPitch=100 – this is the average pitch in Hz. It is about equivalent to pitch minus 2 for Paul in BTSpeak speech settings.

I set the pitch this way because many BTSpeak users have found that DECtalk sounds clearer and less harsh at this lower pitch, especially through the device’s inbuilt speakers.

Extra Speech Settings

Because BTG is designed to interpret markdown output, I have implemented voice fonts to represent different types of content. So, the default voice fonts are set as follows:

  • Bold is 15 percent louder than regular text.
  • Italics and text in single quotes or back ticks are 15 points higher in pitch than regular text.
  • H1 and H2 are using the contrast voice (Harry by default) and speaking 25 words per minute slower.
  • Block quotes use the contrast voice (Harry).

Some (but not all) of these voice font aspects are adjustable and can be tailored using the following four additional speech settings. All the offsets can be a positive or negative number.

  • dtOffsetRate=-25 – set to -25 (minus 25) by default, thus slowing down headings by 25 words per minute.
  • dtOffsetVolume=15 – set to 15 by default, making bold text louder.
  • dtOffsetPitch=15 – set to 15 by default, raising the pitch of italics and of text in single quotes or back ticks.
  • dtContrastVoice=h – set to Harry by default.
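
As a worked example of the offsets, the sketch below combines the default base values with the default offsets. It assumes each offset is simply added to the matching base value, which is a simplification and may not be exactly how btg applies them (the volume offset in particular is described as a percentage). The helper apply_offset and the result variable names are hypothetical.

```shell
# Base values and offsets copied from the defaults listed in this post.
dtRate=280; dtVolume=60; dtPitch=100
dtOffsetRate=-25; dtOffsetVolume=15; dtOffsetPitch=15

# Assumption: an offset is simply added to the matching base value.
apply_offset() {
  echo $(( $1 + $2 ))
}

heading_rate=$(apply_offset "$dtRate" "$dtOffsetRate")      # 280 - 25 = 255
bold_volume=$(apply_offset "$dtVolume" "$dtOffsetVolume")   # 60 + 15 = 75
italic_pitch=$(apply_offset "$dtPitch" "$dtOffsetPitch")    # 100 + 15 = 115

echo "headings: rate $heading_rate, bold: volume $bold_volume, italics: pitch $italic_pitch"
```

Changing dtOffsetRate to a positive number would make headings faster than regular text instead of slower.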

Technical explanation: Because each voice has its own default pitch, when you adjust the pitch, it needs to be associated with that voice, using the DECtalk Design Voice feature. The Design voice name is Val or v. Voice V for Val has been set to a combination of your preferred voice and your preferred pitch. This means you can set your contrast voice to V – for Val, and it will be the same as your standard voice and pitch combination. It also means that if your primary voice is Paul, but at a lower pitch, you can set the contrast voice to P – for Paul, and it will be at Paul’s default pitch.

I appreciate that the last few paragraphs might be a bit overwhelming, or unclear before you have used btg, but I am providing the information for reference. To start out, I recommend just using the defaults for these last four offset settings; you can experiment once you are more familiar with how the app works and how you find the markdown interpretation.

Other Configuration File Settings

Take a read through the rest of the configuration file, but only make changes if you are pretty sure of what you are doing.

  • For example, if you have a paid Open AI key you will likely want to change the model to something like gpt-4-turbo-preview
  • You might want to change the system prompts sent to GPT, but I recommend starting with the defaults I have set up to see how things work. Once you understand the configuration file, you could add your name to the GPT prompt and tell it where in the world you are, your preferred language and whether you want metric units. You could ask it to use tactile and auditory metaphors in its descriptions and explanations.

Closing

Enjoy experimenting and exploring btgpt. Please let me know your experiences good and bad, and how you think I could make the app work better.

Thanks Tim Noonan Tim@timnoonan.com.au


  1. The BTSpeak from Blazie Technologies is a Raspberry Pi-based Braille input, Speech output pocket computer, running Linux. It is the much-requested successor to the iconic Braille ‘n Speak Braille note-taker developed by Blazie Engineering in the late 1980s. 

One Response to Introducing btgpt – a Self-Voicing Chat GPT app for the BTSpeak – Beta1 now available

  1. Tim, wonderful work and much needed. I’m impressed. Thanks for contributing. On this web page there is a hard to find typo VTSpeak is there instead of BTSpeak. Just search for VTS.

    I’ll try but when I get a chance.

    Deane
