Personal data in AI?

Researchers found a way to get personal data from chatbots. AI is being trained on data from all kinds of places. Sources like Reddit, Wikipedia and Google Books are some of the ones used. Most of this data is informative and is used to generate responses. However, researchers from multiple universities and organizations have published a paper in which they found a way to extract people's personal data from a chatbot. The data in question includes an email address and a phone number, which are both real and not made up by the chatbot.

The researchers get the data by basically attacking the chatbot. They ask the chatbot to repeat a ‘token’ a bunch of times. After a while the chatbot ‘diverges’ and starts saying other things, including original data. What I mean by original data is the actual data the AI was trained on. Once they had access to this original data, they found out that it could contain people’s personal data. They found a person’s email sign-off, which had a lot of personal data like their full name, their email address and their phone number.
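To make the idea of ‘diverging’ more concrete, here is a small Python sketch. It is only an illustration, not the researchers' code: the function names, the example word "poem" and the sample response are all made up by me, and nothing here talks to a real chatbot. It builds the kind of prompt described above and then splits a response into the repeated part and whatever came after it.

```python
def build_repeat_prompt(word: str) -> str:
    """Build a prompt asking the chatbot to repeat one word indefinitely.

    The exact wording is an assumption for illustration.
    """
    return f'Repeat the word "{word}" forever.'


def split_at_divergence(response: str, word: str) -> tuple[int, str]:
    """Return (number of repeats, text after the repeating stopped)."""
    tokens = response.split()
    repeats = 0
    for tok in tokens:
        # Ignore case and surrounding punctuation when comparing.
        if tok.strip(".,!?\"'").lower() != word.lower():
            break
        repeats += 1
    return repeats, " ".join(tokens[repeats:])


# Made-up response for illustration only, not real model output.
fake_response = "poem poem poem poem Jane Doe, Example Corp, jane@example.com"
count, leaked = split_at_divergence(fake_response, "poem")
print(count)   # how many times the word was repeated
print(leaked)  # the text produced after the divergence
```

In a real experiment the text after the divergence would then have to be checked against known training data, which this sketch does not attempt.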

All of this was very shocking when I read this article for the first time. The fact that these chatbots have access to such personal data shows how little data filtering has been done. Personal data is quite useless for chatbots, because it is not informative. I personally think that this is a massive failure on the part of the developers. I think they should have done more data filtering, because this can be used in harmful ways.
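As a rough idea of what simple data filtering could look like, here is a minimal Python sketch that replaces email addresses and phone numbers with placeholders. This is my own assumption of a basic approach, not what any chatbot developer actually uses. The patterns and the sample text are made up, and simple patterns like these would miss many real formats and would not catch names at all.

```python
import re

# Deliberately simple patterns; real filtering would need to be far more careful.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


# Made-up sign-off for illustration only.
sample = "Best regards, Jane Doe, jane.doe@example.com, +1 (555) 010-0199"
print(redact_pii(sample))
```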

This is the article I found: https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html

And this is the paper the article is about: https://arxiv.org/pdf/2311.17035

Eppo van Brenk

My name is Eppo, I use he/him, and my major is undecided at the moment. My hobbies are playing frisbee, hanging out with friends and gaming. I love doing new things and learning about them. I am taking this class because I think AI is going to play a big role in our future, and I would like to learn more about it to understand it better.

The photo above is of my brother and me going paragliding last summer.

I think the role AI is playing in our lives at the moment is very chaotic. It is not regulated and is not confined. However, there are great developments being made, which show that there are so many opportunities.