This is the correct answer.
The only solution is to deeply integrate the gibberish into everything we post.
I, for one, welcome our insane (unsane?) overlords.
I don’t see how that would be practical. People who aren’t “in on the joke”, as it were, will call out the gibberish and downvote it. If enough people are “in on the joke” then the whole forum becomes useless and some other forum will be created to fill the role of the original. The AI will train off of that one.
Basically, if you don’t want an AI training on your content, then don’t post your content in public where an AI will see it. The Fediverse is the last place you should be posting since its very nature is about openly broadcasting your content to whoever wants to see it.
OTOH people are better at filtering out, or at least recognising, gibberish than LLMs are. At least for now.
You are right about the Fediverse being used for training content though.
I’m curious about the levels of bot posting compared to xitter etc. A low rate here would make it even more attractive as training data for avoiding model collapse.
Well, the “at least for now” part is my point - if people start using “gibberish” to communicate or to hide their communication, that provides training material for LLMs to let them figure out how to use it too.
LLMs learn how to communicate from existing examples of communication. As long as humans are communicating with each other somehow, LLMs will be able to learn how to do it too. They have the same communication capabilities that we do at this point, so there’s not really any way we can make a secret clubhouse that they can’t figure out how to infiltrate.
Personally, I think there are two main routes we can go to deal with this. Either we simply accept that there’s no way to be 100% sure we’re talking to a human any more and evaluate the value of our conversation based on the content of the words spoken rather than the composition of the entity generating them, or we come up with some kind of “proof of personhood” system to allow people to label the text they write as coming from them.
The latter is extremely hard to do, of course, both from a technical and cultural perspective. And such a system would likely still allow someone’s “person token” to be sneakily used by AI, either by voluntarily delegating it (I could very well be retyping all of this out of a ChatGPT window) or through hackery.
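To make that concrete: a “person token” could be as simple as a keypair whose private half signs every post, with the public half published on your profile. Here’s a rough, purely illustrative sketch (using the common `cryptography` package; nothing here is a real Fediverse feature), which also shows the delegation problem, since the key signs whatever bytes you hand it, including text pasted out of a chatbot:

```python
# Hypothetical "person token": an Ed25519 keypair used to sign posts.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()   # kept secret by the person
verify_key = signing_key.public_key()        # published on their profile

post = "a human definitely typed this".encode()
signature = signing_key.sign(post)

# Anyone can check the post against the published key;
# verify() raises InvalidSignature if the post or signature was tampered with.
verify_key.verify(signature, post)

# But the signature only proves which key signed the text, not who wrote it:
# the same call happily signs output copied straight from a chatbot window.
```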
So I’m inclined toward the former. If I’m chatting with someone and I’m having a good time doing it, and then later I find out it was a bot, why should that change how much fun I had?
My point is that if we turn up our gibberish dial now then at least our llms will be learning the wrong thing & we have some control.
There is still a lot of understanding that we do automatically that an llm will never do. I still 4eckon I can spot gibberish better than an llm & I would like to keep it that way.
Or we just give up. As you can see I have mostly given up.
We’d be covering ourselves in poop to prevent people from sitting next to us on the train. Sure, people will avoid sitting next to us, but in the meantime we’ll be covered in poop.
And then other people will learn the trick, cover themselves in poop too, and now everyone’s poopy and the trick stops working.
Are you willing to bet the convenience of comprehensible online discourse on that? “Automatically understanding stuff” is basically the one job of LLMs.
LLMs model language, and coming up with some kind of “gibberish” filter is simply inventing a new language. If there’s semantic meaning in it the LLMs will figure it out just like any other language, and if there isn’t semantic meaning then we’ve lost the ability to communicate entirely. I see no upside.
Sigh
I think the point of gibberish is that it is not language.
That’s why imagination and creativity are required - no?
I feel like I’m talking to an llm right now. Too many words.
If it’s not communicating anything, what’s the point?