This week was focused on building an OpenAI API integration for text generation and connecting it with other project components. OpenAI’s text generation capabilities are extensive and offer a range of input modalities and model parameters. In their tutorials and documentation, OpenAI “recommend[s] first attempting to get good results with prompt engineering, prompt chaining (breaking complex tasks into multiple prompts), and function calling.” To do so, I started by asking ChatGPT to “translate” ASL given sentences written in ASL syntax. ChatGPT had a lot of success doing this correctly given isolated phrases, without much context or prompting. I then described my prompt, reflecting how GPT will be used in the context of our project, as follows:
I will give you sequential single word inputs. Your objective is to correctly interpret a sequence of signed ASL words in accordance with ASL grammar and syntax rules and construct an appropriate english sentence translation. You can expect Subject, Verb, Object word order and topic-comment sentence structure. Upon receiving a new word input, you should respond with your best approximation of a complete English sentence using the words you’ve been given so far. Can you do that?
It worked semi-successfully, but ChatGPT had substantial difficulty moving away from the “conversational” context it’s expected to function within.
So far, I’ve used ChatGPT-4 and some personal credits to work in the OpenAI API playground, but I would like to limit personal spending on LLM building and fine-tuning going forward. I would like to put credits towards gpt-3.5-turbo, since fine-tuning for GPT-4 is in an experimental access program and, for the scope of our project, I expect gpt-3.5-turbo to be robust enough to produce accurate translations.
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# System prompt describing how the assistant should behave.
behavior = (
    "You are an assistant that receives word by word inputs and interprets "
    "them in accordance with ASL grammatical syntax. Once a complete sentence "
    "can be formed, that sentence should be sent to the user as a response. "
    "If more than 10 seconds have passed since the last word, you should send "
    "a response to the user with the current sentence."
)

# Each signed word is sent as its own user message; the assistant message
# shows the model an example of the expected translation.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": behavior},
        {"role": "user", "content": "last"},
        {"role": "user", "content": "year"},
        {"role": "user", "content": "me"},
        {"role": "user", "content": "went"},
        {"role": "user", "content": "Spain"},
        {"role": "assistant", "content": "I went to Spain a year ago."},
        {"role": "user", "content": "where"},
    ],
)

# Print the model's reply to the most recent word.
print(response.choices[0].message.content)
The system message helps set the behavior of the assistant.
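Rather than writing out the message list by hand, it could be built from the words recognized so far. The following is only a minimal sketch: `build_messages` is a hypothetical helper name (not part of the OpenAI library), and it assumes the same convention as the call above, with one system message followed by one user message per signed word.

```python
def build_messages(behavior, words):
    """Return a chat message list: the system prompt, then one user message per word."""
    messages = [{"role": "system", "content": behavior}]
    for word in words:
        messages.append({"role": "user", "content": word})
    return messages


# Example with a placeholder system prompt (an assumption for illustration).
messages = build_messages(
    "Interpret ASL word sequences as English sentences.",
    ["last", "year", "me", "went", "Spain"],
)
```

The resulting list could then be passed as the `messages` argument of `client.chat.completions.create`.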
Eventually, we would like to use an OpenAI text generation model such that we provide word inputs as a stream and receive text only as complete English sentences.
SVO (Subject, Verb, Object) is the most common sentence word order in ASL, whereas Object, Subject, Verb (OSV) order necessarily uses non-manual features (facial expression) to introduce the object of a sentence as the topic. As a result, we will restrict the scope of our current LLM to interpreting SVO sign order.
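One possible way to approach this is to buffer incoming words on our side and only ask the model for a sentence once the signer has paused. A single chat completion request carries no information about the time elapsed between messages, so the 10-second rule from the system prompt would likely need to be tracked by our own code. The sketch below is an assumption about how that could look, not our actual implementation: `WordBuffer`, `translate`, and `pause_seconds` are hypothetical names, and `translate` stands in for whatever function wraps the API call.

```python
import time


class WordBuffer:
    """Collects recognized words and decides when to request a translation."""

    def __init__(self, translate, pause_seconds=10.0, clock=time.monotonic):
        self.translate = translate          # callable taking a list of words, returning a sentence
        self.pause_seconds = pause_seconds  # pause length that ends a sentence
        self.clock = clock                  # injectable so the timing can be tested
        self.words = []
        self.last_word_time = None

    def add_word(self, word):
        """Store a newly recognized word and record when it arrived."""
        self.words.append(word)
        self.last_word_time = self.clock()

    def poll(self):
        """Return a sentence if the pause has been long enough, otherwise None."""
        if not self.words:
            return None
        if self.clock() - self.last_word_time < self.pause_seconds:
            return None
        sentence = self.translate(self.words)
        self.words = []  # start a fresh sentence
        return sentence


# Demonstration with a fake clock and a stub in place of the model call.
fake_now = [0.0]
buffer = WordBuffer(translate=lambda ws: " ".join(ws), clock=lambda: fake_now[0])
buffer.add_word("me")
buffer.add_word("went")
fake_now[0] = 3.0
early = buffer.poll()  # pause too short, nothing returned yet
fake_now[0] = 12.0
late = buffer.poll()   # pause exceeded, buffered words are translated
```

In practice, `poll` would be called periodically by the component that receives recognized signs, and the stub `translate` would be replaced by a function that sends the buffered words to the model.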