Data Privacy Alarm Bells Ringing for #GenAI users #AI

What we are seeing now with data privacy policy changes is a bait-and-switch. First, we’re promised our data is safe. Then, after we use the tool and find it indispensable, the policy is changed without consultation and our data can be consumed by Big Tech.

Think about how many teachers and professors are using consumer versions of Gen AI tools at work, processing, as one superintendent put it in a conversation, budget information and staff and student schedules with names left in. What happens to this data?

Jon Accarrino from MIT points out the issues:

The AI industry just pulled off the biggest privacy heist in tech history. 🤔

OpenAI, Google, and Anthropic all reversed user privacy protections in August… while you were planning Labor Day weekend. Now your AI chatbot conversations could become permanent training data and subject to law enforcement monitoring. 😨

If your company still allows employees to access free and consumer AI tools like ChatGPT, your most sensitive data could already be someone else’s competitive advantage. 😠

My latest article for TVNewsCheck breaks down what all media executives need to know: https://lnkd.in/en6HeNeT

My Notes

  • OpenAI, Google, Anthropic changed their data privacy policies
  • Sensitive data staff put into Gen AI tools now becomes “someone else’s competitive advantage”
  • How is your organization or company safeguarding sensitive data when employees are putting it into consumer Gen AI tools that have adjusted their data privacy policies?

In his article, he identifies three general categories of Gen AI use. What’s frightening to me is that this perspective mirrors one that educators and many in non-profit organizations encourage, because it’s the way it’s always been done with the ed tech of yesteryear. Here’s a revision from an education perspective (see his original in the article):

  • Testing free AI platforms. Teachers put student records and budget data into free versions of ChatGPT. No rules exist. Data privacy agreements are non-existent, and a data breach is right around the corner.
  • Restricted paid AI access for certain educators. This works for a few, but leaves an unknown number using unsafe consumer tools.
  • Protected district systems with tailored programs, safe student data storage, connection points, and IT oversight. Minimal danger.

New policies announced:

  • OpenAI (Aug. 26): Now monitors all ChatGPT conversations. Human reviewers can read flagged chats. Users may be reported to law enforcement.
  • Anthropic (Aug. 28): Reversed privacy stance. All conversations train AI models unless users opt out by Sept. 28. Data kept for 5 years instead of 30 days.
  • Google (Aug. 13): Auto-enables “Personal Content” feature. All Gemini chats train AI models by default starting Sept. 2. Users must manually disable.

You see the pattern, right? All three major AI providers have made the shift to automatic data collection with opt-out requirements rather than opt-in consent.

What’s the solution?

Schools and organizations need internal, local Gen AI solutions. That is, a Gen AI interface that uses frontier models (e.g., ChatGPT, Claude, Gemini) via API, so that stored conversations remain on the user’s computer and the organization’s network. It could also mean running free, open-source (open-weight) Gen AI models internally.
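To make those two options concrete, here’s a minimal sketch in Python. It assumes the official openai package with an organization-issued API key, and, for the open-weight route, an Ollama server running on your own hardware; the model names are illustrative, not recommendations.

```python
# Minimal sketch: two ways an internal tool can reach a model without a
# consumer chat account. Assumes the `openai` package, an API key issued
# under your organization's API agreement, and (for option 2) a local
# Ollama server. Model names are illustrative.
from openai import OpenAI
import requests

# Option 1: frontier model via API. Traffic goes through your org's key,
# and conversations are stored wherever *you* decide to store them.
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Draft a parent newsletter blurb."}],
)
print(response.choices[0].message.content)

# Option 2: open-weight model running entirely on your own hardware,
# here via Ollama's local HTTP endpoint on its default port.
local = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1", "prompt": "Draft a parent newsletter blurb.", "stream": False},
)
print(local.json()["response"])
```

Either way, the organization, not the vendor’s consumer product, decides where prompts and replies live.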

But how to accomplish this? I’m going to say it again: BoodleBox Unlimited is the best $20 investment for data privacy protection. I’m not paid to say it, and I do not gain personally or professionally from it. I know you are skeptical; I was, as well.

Unless you can run local Gen AI on a powerful internal server and make it available to your people, you need a solution that has crystal clear, transparent data retention policies and data privacy agreements, deletes chats immediately (rather than storing them indefinitely), and accesses frontier models via an API.

Here’s an explanation of how it works, using BoodleBox as an example (explanation generated via Claude 4.1 Opus):

How BoodleBox Accesses Frontier Models via API:

When you type a message in BoodleBox, here’s what happens:

  1. Your message travels from BoodleBox’s interface to their servers. BoodleBox adds authentication credentials and formatting.
  2. BoodleBox sends your request to the AI provider’s API endpoint (like OpenAI, Anthropic, or Google). Think of it as BoodleBox knocking on the AI’s door with your question.
  3. The AI model processes your request on the provider’s servers. Your data never touches BoodleBox’s infrastructure for processing – they’re just the messenger.
  4. The response returns through the same path: AI provider → BoodleBox → your screen.

Key Privacy Points:

  • BoodleBox doesn’t store or train on your conversations
  • Each AI provider has separate data policies (as you noted earlier)
  • BoodleBox acts as a secure middleman, not a data collector
  • Your conversations with different bots stay isolated from each other

The Protection: BoodleBox shields you from direct exposure to consumer AI platforms. Your organization’s data stays within enterprise agreements, not personal accounts that feed AI training.
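BoodleBox’s internal code isn’t public, so what follows is only a generic sketch of the “secure middleman” relay pattern described above, using FastAPI and the openai package; the endpoint path and model name are hypothetical.

```python
# Generic sketch of the relay pattern: add credentials, forward the
# request, return the reply, persist nothing. Not BoodleBox's actual code.
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
provider = OpenAI()  # API key lives with the relay, never with end users

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")  # hypothetical endpoint
def chat(req: ChatRequest):
    # Steps 1-3: knock on the provider's door with the user's question.
    result = provider.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": req.message}],
    )
    # Step 4: hand the answer back. Nothing is written to disk or a
    # database, so the relay is a messenger, not a data collector.
    return {"reply": result.choices[0].message.content}
```

The privacy property comes from what the relay does not do: no logging, no training, no shared consumer account tying one user’s chats to another’s.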

Or, if you would rather self-host your local Gen AI (expensive, difficult for enterprise use):

How Local AI Solutions Access Frontier Models via API:

When you type a message in a local AI platform, here’s what happens:

  1. Your message travels from the organization’s interface to their own servers. The local system adds authentication credentials and formatting.
  2. Your servers send the request to the AI provider’s API endpoint (like OpenAI, Anthropic, or Google). Think of it as your organization knocking on the AI’s door with your question.
  3. The AI model processes your request on the provider’s servers. Your data never touches the local platform’s external infrastructure for processing – your servers are just the messenger.
  4. The response returns through the same path: AI provider → your organization’s servers → your screen.

Key Privacy Points:

  • Your organization controls data storage and retention policies
  • Each AI provider has separate data policies (as you noted earlier)
  • Your local system acts as a secure gateway, not a data collector
  • Your conversations with different models stay within your infrastructure

The Protection: Local AI solutions shield you from direct exposure to consumer AI platforms. Your organization’s data stays within enterprise agreements and your own servers, not personal accounts that feed AI training.
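The self-hosted flow is the same relay running on your own servers, so the key difference is who sets retention. Here’s a sketch, with hypothetical MODEL_ENDPOINT, MODEL_NAME, and RETAIN_CHATS settings standing in for whatever your IT department would actually configure:

```python
# Self-hosted variant: your servers hold the key, your policy decides
# whether anything is kept. All setting names here are hypothetical.
import os
from openai import OpenAI

RETAIN_CHATS = os.getenv("RETAIN_CHATS", "false") == "true"  # org policy knob

client = OpenAI(
    # Point at a hosted provider, or at an internal OpenAI-compatible
    # server (e.g., vLLM or Ollama) to keep everything on-premises.
    base_url=os.getenv("MODEL_ENDPOINT", "https://api.openai.com/v1"),
)

def ask(message: str) -> str:
    reply = client.chat.completions.create(
        model=os.getenv("MODEL_NAME", "gpt-4o-mini"),
        messages=[{"role": "user", "content": message}],
    ).choices[0].message.content
    if RETAIN_CHATS:
        # Your infrastructure, your retention schedule -- not the vendor's.
        with open("/var/log/genai/audit.log", "a") as log:
            log.write(f"{message!r} -> {reply!r}\n")
    return reply
```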

Using free ChatGPT is like posting on social media: they own and learn from your content. Using APIs is like hiring a consultant under a non-disclosure agreement (NDA): they process your request but can’t keep or learn from your data. You must verify your organization has enterprise/API agreements, not consumer accounts. Some organizations accidentally use consumer subscriptions thinking they’re protected.

In his article, Accarrino suggests several actions and responses to take. I shudder to think what these steps would yield for K-16 schools, non-profits, and small businesses. They might look like this, as inspired by Jon Accarrino’s two steps for media companies:

Week 1: Discovery Phase

Day 1-2: Launch District-Wide Survey 
• Send Google Form to all staff asking: Which AI tools do you use? What student/district data have you uploaded?
• Include teachers, administrators, support staff, bus drivers, cafeteria workers

Day 3-4: Department Head Meetings 
• Meet with each department (Special Ed, Athletics, Counseling, IT, Finance)
• Document specific AI uses in each area
• Identify highest-risk data exposures

Day 5: Immediate Protection 
• Email all staff with privacy setting instructions
• Show how to opt out of AI training in ChatGPT, Claude, Gemini
• Mandate: No sharing AI chat links on social media or public sites

Weeks 2-4: Policy Development

Week 2: Create Policies 
• Draft “No Free AI Tools” policy for board approval
• Define what counts as “district business”
• List approved vs. prohibited AI platforms

Week 3: Evaluate Solutions 
• Request demos from enterprise AI vendors
• Compare costs for district-wide licenses
• Check FERPA/COPPA compliance documentation

Week 4: Training Rollout 
• Develop 30-minute mandatory training module
• Create quick reference guides for teachers
• Schedule department-specific training sessions

Given my own experience in a 10K-student school district, simply getting approval to send out the survey would require a Superintendent’s Cabinet-level meeting, then a conversation with campus principals and district department leads. Simply put, this schedule is too aggressive for schools. You can’t snap your fingers and see this put in place. I still remember trying to get spreadsheets and student information encrypted in transit and at rest.

Some people didn’t see the value or understand the nefarious players, and they laughed it off until they misplaced an iPad loaded with confidential student data on a trip. Then it was MY problem to solve. It could have been awful, but that one turned out OK. Others don’t; they simply get swept under the rug.

I have an article on Shadow AI in Schools appearing sometime in the future. Now I need to revisit and update it.

When’s the last time you revisited your Gen AI policy in schools?



Comments

  1. If your school is using a Workspace Plus license, your data does not assist the LLM for Gemini. This does bring up a great discussion, though, for those users that use consumer-grade tools.

  2. That’s accurate, Justin. If you are using Google Workspace for Education, any data that staff inadvertently add to a chat is NOT permitted under contract to be used for training. The same applies for Business Enterprise users. HOWEVER, the consumer version at $21.32 a month (see? I’ve paid for it that way and know the exact price with tax in Texas) DOES use anything you give it as training data.

    The only $20 alternative that is data-privacy safe is BoodleBox Unlimited, based on my research. They provide API access to the various chatbot models. That said, I would encourage you to use Education or Enterprise level tools with contractual protections (those are beyond individuals, however, because it’s your organization that has to enter into the agreement), or Gen AI tools that access frontier models via API, whether local Gen AI on your device or via cloud API like BoodleBox, MagicSchool, SchoolAI, Eduaide, etc.

    Hope this helps,
    Miguel Guhlin
