Security of Generative AI: Avoiding Microsoft Copilot Data Exposure

December 5, 2023

Microsoft Copilot has been called one of the most powerful productivity tools on the planet.

Copilot is an AI assistant built into each Microsoft 365 app, including Word, Excel, PowerPoint, Teams, and Outlook. Microsoft’s goal is to take the drudgery out of daily work so people can concentrate on creative problem-solving.

What sets Copilot apart from ChatGPT and other AI tools is that it has access to everything you’ve ever worked on in 365. Copilot can instantly search and compile information from across your documents, presentations, emails, calendar entries, notes, and contacts.

And that’s where information security teams run into trouble. Copilot can access all of the sensitive data that a user can access, which is frequently far too much. On average, 10% of an organization’s M365 data is open to all employees.

Copilot can also rapidly generate new sensitive data that must be protected. Even before the AI revolution, people’s ability to create and share data far outpaced their capacity to protect it; just look at data breach trends. Generative AI pours kerosene on this fire.

There is a lot to unpack with generative AI as a whole, including model poisoning, hallucinations, and deepfakes. In this article, however, I’ll focus specifically on data security and how your team can ensure a safe Copilot rollout.

Use cases for Microsoft 365 Copilot

With a collaboration suite such as M365, the applications of generative AI are practically endless. It’s understandable why a large number of IT and security teams are rushing to secure early access and are organizing their rollout strategies. The productivity increases will be substantial.

You can ask Copilot to create a proposal for a client, for instance, by opening a blank Word document and providing it with a target data set that includes PowerPoint decks, OneNote pages, and other office documents. It takes only a few seconds to have a complete proposal.

Here are a few more examples Microsoft gave during their launch event:

Copilot can attend your team meetings and record action items, provide a real-time summary of the topics covered, and identify any unanswered questions.

Outlook’s Copilot feature can assist you with email prioritization, inbox management, thread summarization, and reply generation.

Excel’s Copilot can analyze raw data and provide you with trends, insights, and recommendations.

How Microsoft 365 Copilot works

This is a brief summary of the steps involved in processing a Copilot prompt:

A user enters a prompt in an app such as Word, Outlook, or PowerPoint.
Microsoft gathers the user’s business context based on their M365 permissions.
The prompt is sent to the LLM (such as GPT-4) to generate a response.
Microsoft performs responsible AI post-processing checks.
Microsoft generates a response and commands and returns them to the M365 app.
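
The steps above can be sketched as a small pipeline. This is an illustrative sketch only, not Microsoft’s implementation: Document, gather_context, call_llm, and post_process are hypothetical names, the keyword match stands in for real retrieval, and the LLM call is a placeholder. The point it shows is that the context handed to the model is limited only by what the user is allowed to view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    name: str
    text: str
    allowed_users: frozenset  # hypothetical stand-in for M365 permissions


def gather_context(user: str, prompt: str, corpus: list) -> list:
    """Step 2: keep documents the user may view that share a word with the prompt."""
    words = {w.lower() for w in prompt.split()}
    return [
        doc for doc in corpus
        if user in doc.allowed_users
        and words & {w.lower() for w in doc.text.split()}
    ]


def call_llm(prompt: str, context: list) -> str:
    """Step 3: placeholder for the LLM call; it only lists the sources it was given."""
    sources = ", ".join(doc.name for doc in context) or "no sources"
    return f"Draft answer to '{prompt}' based on: {sources}"


def post_process(response: str) -> str:
    """Step 4: placeholder for responsible AI checks."""
    return response.strip()


def handle_prompt(user: str, prompt: str, corpus: list) -> str:
    """Steps 1 to 5: prompt in, permission-trimmed context, response out."""
    context = gather_context(user, prompt, corpus)
    return post_process(call_llm(prompt, context))


corpus = [
    Document("q3-forecast.xlsx", "revenue forecast for Q3", frozenset({"alice"})),
    Document("all-hands.pptx", "company revenue update", frozenset({"alice", "bob"})),
]
print(handle_prompt("bob", "revenue summary", corpus))
print(handle_prompt("alice", "revenue summary", corpus))
```

In this sketch, bob’s answer draws only on all-hands.pptx, while alice’s draws on both files. If a file is shared too broadly, it enters the context for everyone it is shared with.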

With Microsoft, there is always strong tension between productivity and security.

This was evident during the COVID-19 pandemic when IT teams hurriedly implemented Microsoft Teams without fully comprehending the underlying security model or the configuration of their organization’s M365 groups, permissions, and link policies.

The good news

Tenant isolation. Copilot only uses data from the current user’s M365 tenant. The AI tool will not surface data from other tenants where the user is a guest, or from tenants set up with cross-tenant sync.

Training boundaries. Copilot does not use any of your business data to train the foundational LLMs that it uses for all tenants. You shouldn’t have to worry about your confidential information showing up in responses to users in other tenants.

The bad news

Permissions. Copilot surfaces all organizational data to which an individual user has at least view permissions.

Labels. Copilot-generated content will not inherit the MPIP labels of the files that Copilot sourced its response from.

People. Copilot’s responses aren’t guaranteed to be 100% factual or safe; humans must take responsibility for reviewing AI-generated content.

Let’s take the bad news one item at a time.

Permissions

Granting Copilot access to only what a user can access would be an excellent idea if companies could easily enforce least privilege in Microsoft 365.

Microsoft states in its Copilot data security documentation:

“It’s important that you’re using the permission models available in Microsoft 365 services, such as SharePoint, to help ensure the right users or groups have the right access to the right content within your organization.”

Source: Data, Privacy, and Security for Microsoft 365 Copilot

We know empirically, however, that most organizations are about as far from least privilege as they can be. Just take a look at some of the stats from Microsoft’s own State of Cloud Permissions Risk report.

This picture is consistent with what Varonis observes in the thousands of Data Risk Assessments we conduct each year for companies using Microsoft 365. In our report, The Great SaaS Data Exposure, we found that the average M365 tenant has:

More than 40 million distinct permissions
113K+ private documents made available to the public
Over 27K sharing links

Why does this happen? Microsoft 365 permissions are extremely complex. Just consider all the ways a user can gain access to data:

Direct user permissions
Microsoft 365 group permissions
SharePoint local permissions (with custom levels)
Guest access
External access
Public access
Link access (anyone, org-wide, direct, guest)

Permissions are mostly controlled by end users rather than IT or security teams, which exacerbates the situation.
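
To see why these overlapping paths make least privilege hard to reason about, here is a minimal sketch that lists every path by which one user can reach one file. It covers only three of the paths above (direct, group, and link access), and the Resource class, its fields, and the link_scope values are hypothetical simplifications, not a Microsoft 365 API.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    direct_users: set = field(default_factory=set)
    groups: set = field(default_factory=set)
    # Hypothetical values: "none", "direct", "org-wide", "anyone"
    link_scope: str = "none"


def access_paths(user: str, user_groups: set, resource: Resource) -> list:
    """Return every path that grants the user access; an empty list means no access."""
    paths = []
    if user in resource.direct_users:
        paths.append("direct")
    for group in sorted(user_groups & resource.groups):
        paths.append(f"group:{group}")
    # Org-wide and anyone links grant access regardless of user or group.
    if resource.link_scope in {"org-wide", "anyone"}:
        paths.append(f"link:{resource.link_scope}")
    return paths


payroll = Resource(
    "payroll.xlsx",
    direct_users={"hr-lead"},
    groups={"HR"},
    link_scope="org-wide",
)
print(access_paths("hr-lead", {"HR"}, payroll))
print(access_paths("bob", {"Sales"}, payroll))
```

In this example, bob has no direct or group permission on payroll.xlsx, yet the org-wide link still gives him access. Removing a user from a group is not enough if another path remains open.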

Labels

Microsoft relies heavily on sensitivity labels to enforce DLP policies, apply encryption, and broadly prevent data leaks. In practice, however, getting labels to work is difficult, particularly if you depend on people to apply them.

Microsoft presents labeling and blocking as the ultimate safety net for your data. The reality is less favorable. As people create data, labeling frequently lags behind or becomes outdated.

Blocking or encrypting data can add friction to workflows, and labeling tools are limited to particular file types. The more labels an organization has, the more confusing they can become for users. This problem is especially acute for larger organizations.

Label-based data protection will almost certainly become less effective once AI is generating orders of magnitude more data, all of which requires accurate, automatically updating labels.
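
As noted earlier, content that Copilot generates does not inherit MPIP labels from its source files. One rule a team might apply to close that gap is to give generated content the most restrictive label found among its sources. The sketch below illustrates that rule only. The label names and their ordering are hypothetical, and the code does not call any Microsoft API or describe how Copilot or MPIP behaves.

```python
from typing import Optional

# Hypothetical label taxonomy, ordered from least to most restrictive.
LABEL_ORDER = ["Public", "General", "Confidential", "Highly Confidential"]


def most_restrictive_label(source_labels: list) -> Optional[str]:
    """Return the most restrictive known label among the sources, or None.

    Unlabeled sources (None) and labels outside LABEL_ORDER are ignored.
    """
    known = [label for label in source_labels if label in LABEL_ORDER]
    if not known:
        return None
    return max(known, key=LABEL_ORDER.index)


print(most_restrictive_label(["General", None, "Confidential"]))
print(most_restrictive_label([None, None]))
```

Note the weakness the second call exposes: if every source is unlabeled, the rule has nothing to inherit, so it only helps when source labels are accurate to begin with.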

Do my labels look correct?

Varonis can scan, find, and fix the following to validate and enhance an organization’s Microsoft sensitivity labeling:

Sensitive files without a label
Sensitive files with an incorrect label
Non-sensitive files with a sensitive label
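
The three categories above can be expressed as a simple check. This is a hedged sketch of the logic, not how Varonis or Microsoft implements it: the audit_label function, the SENSITIVE_LABELS set, and the sample files are hypothetical, and the is_sensitive flag is assumed to come from a separate content classification step that is out of scope here.

```python
from typing import Optional

# Hypothetical set of labels that count as "sensitive" in this sketch.
SENSITIVE_LABELS = {"Confidential", "Highly Confidential"}


def audit_label(is_sensitive: bool, label: Optional[str], expected: Optional[str]) -> str:
    """Return the mismatch category a file falls into, or "ok" if none applies."""
    if is_sensitive and label is None:
        return "sensitive file without a label"
    if is_sensitive and label != expected:
        return "sensitive file with an incorrect label"
    if not is_sensitive and label in SENSITIVE_LABELS:
        return "non-sensitive file with a sensitive label"
    return "ok"


# (name, is_sensitive, current label, expected label) for each sample file
files = [
    ("salaries.xlsx", True, None, "Highly Confidential"),
    ("contract.docx", True, "General", "Confidential"),
    ("lunch-menu.docx", False, "Confidential", None),
    ("roadmap.pptx", True, "Confidential", "Confidential"),
]
for name, is_sensitive, label, expected in files:
    print(f"{name}: {audit_label(is_sensitive, label, expected)}")
```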

People

AI can make people lazy. Content generated by LLMs such as GPT-4 is not just good, it is often excellent, and in many cases its speed and quality far surpass what a person can do. As a result, people start to blindly trust AI to produce accurate and safe responses.

Real-world cases have already been observed in which Copilot drafted a proposal for a client and included sensitive data belonging to a completely different client. The user clicks “send” after a quick glance (or no glance at all), and now you have a privacy or data breach scenario on your hands.

Getting your tenant security-ready for Copilot

Understanding your data security posture prior to your Copilot rollout is essential. With the general release of Copilot, now is a great time to implement security controls.

With our Data Security Platform, which offers a real-time view of risk and the ability to automatically enforce least privilege, Varonis protects thousands of Microsoft 365 customers.

We can help you address the biggest security risks associated with Copilot with virtually no manual effort. With Varonis for Microsoft 365, you can:

Automatically discover and classify all sensitive AI-generated content.
Automatically ensure that MPIP labels are correctly applied.
Automatically enforce least privilege permissions.
Continuously monitor sensitive data in M365 and respond quickly to abnormal behavior.
