Ever wondered why conversational AI like ChatGPT says “Sorry, I can’t do that” or some other polite refusal? OpenAI is offering a limited look at the reasoning behind its own models’ rules of behaviour, whether that’s sticking to brand guidelines or declining to produce NSFW material.
Large language models (LLMs) have no naturally occurring limits on what they can or will say. That is part of what makes them so versatile, but it is also why they hallucinate and are easily tricked.
Any AI model that interacts with the public needs some guardrails on what it should and shouldn’t do, but actually defining those rules, let alone enforcing them, is surprisingly difficult.
If someone asks an AI to generate a bunch of false claims about a public figure, it should refuse, right? But what if the person asking is an AI developer building a database of synthetic disinformation to train a detector model?
What if someone asks for laptop recommendations? The suggestions should be objective, right? But what if the model has been deployed by a laptop maker that wants it to recommend only its own devices?
AI developers are all navigating dilemmas like these, looking for efficient ways to rein in their models without causing them to refuse perfectly ordinary requests. But they rarely share how they do it.
OpenAI is bucking the trend a little by publishing what it calls its “model spec,” a collection of high-level rules that ChatGPT and its other models are meant to follow.
There are meta-level objectives, some hard rules, and some general behaviour guidelines, though to be clear, these are not exactly what the model is primed with; OpenAI will have developed specific natural-language instructions that accomplish what these rules describe.
It’s an interesting look at how a company sets its priorities and handles edge cases. And there are plenty of examples of how those situations might play out.
For example, OpenAI makes it clear that developer intent is basically the highest law. So one chatbot built on GPT-4 might give you the answer to a maths problem when asked. But if the developer of that chatbot has instructed it never to hand over the answer outright, it will instead offer to work through the solution step by step.
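A minimal sketch of how that hierarchy might look in practice through the chat API, assuming the Python openai client; the prompt wording and the model name are illustrative stand-ins, not OpenAI’s published instructions:

```python
# Sketch: a developer-level instruction overriding a user's request.
# Assumes the openai Python package (v1.x) and OPENAI_API_KEY in the environment;
# the prompt text is hypothetical, chosen to mirror the maths-tutor example above.
from openai import OpenAI

client = OpenAI()

messages = [
    {
        # The developer's system instruction sits above the user's request
        # in the priority order the model spec describes.
        "role": "system",
        "content": (
            "You are a maths tutor. Never state the final answer outright. "
            "Instead, offer to work through the solution one step at a time."
        ),
    },
    {"role": "user", "content": "What is 37 * 48? Just give me the number."},
]

response = client.chat.completions.create(
    model="gpt-4",  # any chat-capable model would do for this sketch
    messages=messages,
)

# Expected behaviour: the reply offers to walk through the multiplication
# rather than simply printing 1776, because the developer instruction
# outranks the user's "just give me the number."
print(response.choices[0].message.content)
```

The same mechanism is how a developer would express any of the restrictions discussed below; the spec itself only describes which instruction wins when they conflict.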
A conversational interface might even refuse to discuss anything outside its approved topics, to shut down manipulation attempts before they start. Why should a cooking assistant weigh in on U.S. involvement in the Vietnam War? Why should a customer-service chatbot agree to help with your steamy supernatural novella? Shut it down.
It also gets sticky in matters of privacy, such as requests for someone’s name and phone number. As OpenAI notes, a public figure like a mayor or member of Congress should clearly have their contact details provided. But what about tradespeople in the area? That’s probably fine, but what about employees of a certain company, or members of a political party? Probably not.
Choosing when and where to draw the line isn’t simple. Nor is writing the instructions that make the AI adhere to the resulting policy. And no doubt these policies will be broken constantly, as people find ways around them or stumble on edge cases that weren’t accounted for.
OpenAI isn’t showing its whole hand here, but users and developers can still benefit from seeing how and why these rules and guidelines are set, even if they aren’t laid out in full.