Currently, 71% of countries have data privacy laws, and those laws are becoming increasingly complex and mature as regulatory bodies and government agencies increase their understanding of the value and risks of big data. Within this environment, protecting sensitive legal data has arguably never been more important.
What are often less visible are the risks inherent in the legal translation process, where translation technologies often create “data lakes” that may include every critical document in a legal proceeding. Without effective policy and control, these documents may be distributed across multiple country borders in a bid to get the materials translated in a timely manner.
In an environment where data breaches can have devastating consequences, the security policies and workflows of the translation process must be understood and reviewed as thoroughly as those of any other large-scale data repository outside of the legal practice firewall.
In this guide, our experts will share vital information on how data breaches can happen during the translation process and what you can do to protect your organization.
How Do Data Breaches Happen During the Translation Process?
Data breaches in the translation process can occur in several ways. One culprit is the use of free public translation tools and AI chatbots like Google Translate or ChatGPT. These tools, while convenient, often store the text submitted to them and can expose it, in some cases making it findable online. Sometimes, this information can even be uncovered through a simple Google search, as Norwegian oil company Statoil discovered. With ChatGPT and similar models, there’s the potential for sensitive or protected information to show up in chats. Other potential causes include human error and weak security protocols.
All of these can lead to data leaks, jeopardizing compliance with regulations like HIPAA in healthcare or GDPR in the EU.
Data Breach Prevention: Dos and Don’ts for Secure Translations
To protect the confidential data your organization is responsible for, it’s crucial to make sure everyone in your organization is following a few simple rules when translations are needed.
Do: Use secure, industry-compliant translation tools for any type of sensitive information.
If the information you’re translating contains protected data, then it’s critically important to use a secure translation tool. Before you use a machine translation tool (whether one you choose or one in use by your language solutions provider), understand the terms and conditions and how your translation data will be used and stored. There are dozens of commercial, private MT tools, and each processes and manages data a little differently.
Do: Disaggregate translation memories (TMs).
A TM tool is designed to reuse previous translations, which can significantly reduce translation costs. However, the TM tool also stores a lot of data about where each translation came from, and this data can be used to recreate any original source document that has been translated. Often the TM data, including the tracking information, is distributed across international borders, increasing the risk that it ends up in a country with different protection or accessibility laws. Depending on the size and scope of the translation relationship, and without effective policy and data transmission controls, the data could also end up in the hands of hundreds of untracked individual translators. Using a vendor, like ULG, that has TM disaggregation capabilities ensures that your documents cannot be recreated from the TM by any recipient. Making sure your vendor has a policy and process in place for this is a critical control requirement.
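To make the idea of disaggregation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of how ULG or any particular TM tool works: the TMEntry fields (document_id, position) and the disaggregate function are hypothetical names invented for this example. The sketch drops the tracking data that ties a segment to its document and then randomizes the order of what remains.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class TMEntry:
    """One translation-memory segment plus hypothetical tracking metadata."""
    source: str
    target: str
    document_id: str  # assumed field: which document the segment came from
    position: int     # assumed field: where the segment sat in that document


def disaggregate(entries):
    """Return (source, target) pairs with tracking data removed and order randomized."""
    # Keep only the text pair. Dropping document_id and position removes the
    # information needed to group segments by document and put them in order.
    # Using a set also collapses segments that repeat across documents.
    pairs = {(entry.source, entry.target) for entry in entries}
    # Sort first so the result does not depend on set iteration order, then
    # shuffle with an operating-system-backed random source.
    shuffled = sorted(pairs)
    secrets.SystemRandom().shuffle(shuffled)
    return shuffled
```

Called on a list of TMEntry objects, disaggregate returns plain (source, target) tuples in random order. A sketch like this lowers the risk of reconstruction but does not eliminate it: a very small TM, or one dominated by a single document, can still reveal a great deal, which is one reason the vendor's policy and process matter as much as the tooling.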
Don’t: Use public machine translation tools, like the free version of Google Translate or Bing Translate.
These can store sensitive data, creating vulnerabilities. When you click ‘accept’ on the terms and conditions for most free public translation tools, you’re agreeing to grant them a license to store the data you submit for translation and to use it to improve their services.
That can cause problems, as Norwegian oil company Statoil found out several years back. The company used a free translation tool available at Translate.com to translate sensitive internal documents. Some of that material, including passwords and emails with personal data, was later found publicly available on Google.
What about buzzy new artificial intelligence (AI) models like ChatGPT? These tools, powered by Large Language Models (LLMs), pose security concerns similar to those of free public MT tools. Depending on the tool’s terms and settings, whatever you submit may become part of the training data, and researchers have already been able to trick ChatGPT into disclosing private information contained in training data.
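One way to back up a policy like this is an automated check that flags obviously sensitive text before anyone pastes it into a public tool. The Python sketch below is only an illustration: the pattern list and the function names find_sensitive and allowed_for_public_tool are assumptions made for this example, and the two patterns (email addresses and password-like fields, the kinds of data exposed in the Statoil case) cover a tiny fraction of what a real policy would need to catch.

```python
import re

# Illustrative patterns only; a real policy would cover far more data types.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "password-like field": re.compile(r"(?i)\b(password|passwd|pwd)\b\s*[:=]"),
}


def find_sensitive(text):
    """Return the sorted labels of every pattern that matches the text."""
    return sorted(
        label
        for label, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(text)
    )


def allowed_for_public_tool(text):
    """True only when none of the illustrative patterns is detected."""
    return not find_sensitive(text)
```

A check like this can only say that text is definitely unsuitable. It cannot say that text is safe, because contracts, performance reviews and termination letters often contain nothing a simple pattern would catch. It should supplement the rule above, not replace it.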
Do: Implement encryption for data in transit.
Whether you’re using MT, human translators, or a mix of both, make sure the data you submit is properly encrypted to keep it safe. Encryption converts your data into an unreadable format until it reaches its destination, preventing unauthorized access. If you’re sending files via email, encrypt the file. If you’re uploading files to a website or portal, be sure that the website or portal is secure.
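For teams that upload files programmatically, the “secure portal” advice can be partly enforced in code. The following Python sketch uses only the standard library; the function names is_secure_upload_url and strict_tls_context, and the example portal address in the test of the first function, are hypothetical. It shows two checks: refusing any upload address that is not HTTPS, and building a TLS configuration that verifies the server’s certificate and rejects protocol versions older than TLS 1.2. It does not cover encrypting a file before emailing it, which needs a separate tool.

```python
import ssl
from urllib.parse import urlparse


def is_secure_upload_url(url):
    """True when the URL uses HTTPS and names a host."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.hostname)


def strict_tls_context():
    """TLS context that verifies certificates and refuses versions below TLS 1.2."""
    # create_default_context() turns on certificate and hostname checks.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The context returned by strict_tls_context can be passed to standard-library clients such as urllib.request.urlopen through its context argument. HTTPS protects the file only while it travels; how the portal stores it afterward is a separate question to ask your provider.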
Don’t: Neglect the importance of comprehensive data governance and control.
No matter how your organization is handling translations, it’s imperative to understand what is happening to your data at every stage of the process. Under some regulations, like the EU’s General Data Protection Regulation (GDPR), your organization remains accountable even if your data is breached while in the custody of a third-party data processor.
Do: Involve a translation partner that focuses on security.
Partnering with a language solutions provider (LSP) that uses secure translation management software (TMS) and rigorous data security protocols is the best way to maintain control over sensitive data and protect your organization and your clients.
The Consequences of Translation Data Breaches
Data breaches in translation can lead to severe consequences:
- Legal repercussions, including lawsuits and significant fines for non-compliance.
- Damage to reputation, as trust is a cornerstone in client relationships.
- Loss of confidentiality, leading to potential identity theft or fraud.
- Erosion of competitive advantage.
- Diminished trust from clients and stakeholders.
Using unsecured tools is just like leaving your credit card on the tube in London. A huge mistake, and never worth the risk.
Real-World Breaches: Lessons Learned
These risks are not just theoretical. Let’s look at some real-world examples.
Translate.com’s Free Translation Is Too Good to Be True
As mentioned earlier, the Translate.com breach exposed highly sensitive user data. Statoil was the largest organization affected, but it was not the only one. It was “one of the many” companies whose confidential documents Norwegian news site NRK was able to find freely available online.
These documents included (but were not limited to):
- A doctor’s email exchange with a global pharmaceutical company regarding tax matters
- Staff performance reports
- Contracts
- Termination letters
Lesson Learned: Avoid free translation services for sensitive documents. As you craft your organization’s policy governing employee usage of these free tools, remember that if employees are allowed to use them at work, they may unintentionally submit sensitive information to determine a document’s content.
Walmart Canada Sued for Third-Party Breach
In another example, Walmart Canada faced a class-action lawsuit for failing to do its due diligence after its third-party photo processor, PNI Digital Media, was breached by hackers.
This breach illustrates what can happen when companies entrust sensitive data to third parties without verifying their security procedures. The financial impact was significant, as Walmart Canada and PNI suffered $1.5 million in financial losses. Walmart Canada also had to change how it handled all vendors, including language solutions providers, from that point on.
Lessons Learned: Verify that any third-party company that handles sensitive data is using the proper procedures, including strong encryption, well-trained staff and the appropriate security technology.
NYC Translation Company Exposes Client Documents
Our final example involves a small translation company based in NYC. This company allowed customers to upload documents for translation to an unsecured database. A total of 25,601 records, many of which contained personally identifiable information (PII), were publicly exposed until security researcher Jeremiah Fowler sounded the alarm earlier this year.
Lessons Learned: Before you upload anything sensitive for translation, make sure that it will remain secure.
How United Language Group Safeguards Data
At United Language Group, we take a comprehensive approach to data security:
- Our fully secure and GDPR-compliant TMS tool, Octave, provides data encryption in translation.
- Strong data governance and control mechanisms for protecting translated content. We uphold ISO 27001:2013 standards for information security management best practices as well as HITRUST CSF certification for data security and management.
- Data region segregation to support specific legal needs across jurisdictions.
- Routine TM/data scraping practices minimize residual data risks for better translation data protection.
- A secure remote environment for translation provides an additional layer of security.
- Custom-built, secure machine translation (MT) services that your employees can use for quick translations without the security risks.
Take the First Step to Increased Data Security
Partnering with an LSP with strong security practices provides you with critical legal protection and peace of mind. Our team is ready to help you move forward with an efficient, high-quality and secure translation process. Contact us for a consultation to see how we can help.