Active Content: What It Is & How It Becomes Malicious

April 5, 2021

Modern business processes rely on sharing information across the cloud. Your users are emailing documents and uploading files to shared drives as part of their jobs. Ensuring safe, secure, clean files is now mission-critical. 

Employees know that they shouldn’t open attached files when they don’t know the person sending them, but phishing emails aren’t the only risk. Downloading an innocent-looking PDF from the internet can also create risk of malware infection.

Understanding what active content is and how it becomes malicious can help you better manage security and mitigate risk. 

What is Active Content?

Active content is code embedded in documents and websites that enhances the end-user experience. Active content in files includes macros, add-ins, and OLE files. On a website, active content can be design elements, like GIFs, short videos, or drop-down boxes. It can also enhance productivity, like spreadsheet macros and predictive text. 

To help further break down active content, let’s look at some additional examples:

Active content in websites

Web developers often use active content to make their sites more appealing. Think about the websites that get your attention. Many of them have embellishments like short animated videos or interactive polls. You might also have been told to disable media players like Flash Player or Lightning Media Player.  

Some additional examples of website active content include:

  • JavaScript
  • Browser plugins
  • Web application add-ons
  • Countdown clocks
  • Toolbars
  • Cookies
  • PHP (Hypertext Preprocessor) scripts

Active content in files

In files, active content looks a little bit different. This active content usually intends to make working with the file easier or faster. For example, an Excel macro lets you automate an action or series of actions in a spreadsheet. By automating these redundant tasks, you can finish a project faster. 

Some additional examples of active content in files include:

  • Add-ins
  • Data connections
  • Color-theme files
  • Links to external pictures
  • Real-time data servers
  • Smart documents
  • Cascading style sheet (CSS) files
  • Linked Object Linking and Embedding (OLE) files
  • XML expansion packs
  • XML manifests
  • ActiveX controls

How Active Content Can Become Malicious 

While active content can be useful, cybercriminals often manipulate it to launch attacks. After all, active content is nothing more than code. When malicious actors can change this code, they can make their malware run, or execute, without the user doing anything other than opening the document or browsing the website. 

For example, web applications “talk” to the servers that store data. A user enters the web application’s URL, the browser talks to the web application, and the web application then talks to the server to retrieve information. The browser-based application then renders that data, turning code into something the user can read. Active content is part of how the browser and server know how to make the data readable to humans, so if cybercriminals change it, they can change how the two communicate. 

The same principle works with documents that have active content. If malicious actors insert malicious code into an Excel spreadsheet’s macro, then that code can be triggered to run the malware as soon as a user opens the file. 
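To make the macro example more concrete, here is a minimal Python sketch that checks whether a file carries a VBA macro project. It assumes the file is a modern Office Open XML document (such as .xlsm or .docm), which is a ZIP container that typically stores macros in a part named vbaProject.bin. The function names and the sample archive are hypothetical, and the check only reports that macros are present, not whether they are malicious.

```python
import io
import zipfile


def has_vba_macros(data: bytes) -> bool:
    """Return True if the bytes look like an OOXML file with a VBA project part."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        # Not a ZIP container (for example a legacy .xls file): out of scope here.
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in archive.namelist()
        )


def make_sample(with_macro: bool) -> bytes:
    """Build a tiny stand-in archive for demonstration; it is not a valid workbook."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/workbook.xml", "<workbook/>")
        if with_macro:
            archive.writestr("xl/vbaProject.bin", b"placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    print(has_vba_macros(make_sample(with_macro=True)))
    print(has_vba_macros(make_sample(with_macro=False)))
```

A check like this can flag files for closer inspection, but it says nothing about what the macro actually does.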

Examples of Harmful Active Content

Understanding some examples of harmful active content can help you better figure out how to protect your environment. 

File Upload Threats

Every file uploaded to your network poses a risk. According to OWASP, the following two classes of problems exist. 

Metadata

First, every file contains metadata, including information like:

  • filename 
  • path
  • author
  • date created
  • date modified
  • file size

When you’re uploading a file to a cloud location, this information is stored in the code and the URL. For example:

www.docs.filesharelocation.com/document/thisisnormallytotallyrandomnumbersandletters

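Any change to the metadata can affect where the file is stored. For example, an attacker who tampers with the filename or path may trick the application into overwriting an important file or saving the upload somewhere it shouldn’t be, and code placed in a document can abuse the file’s URL in ways that lead to a data breach.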
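One common defensive step is to never trust the filename or path that arrives with an upload. The Python sketch below shows one possible approach: strip any directory components and replace unexpected characters before the name is used. The function name `safe_upload_name` and the fallback name "upload" are hypothetical choices for illustration.

```python
import re
from pathlib import PurePosixPath


def safe_upload_name(client_name: str) -> str:
    """Reduce a client-supplied filename to a plain, harmless basename."""
    # Treat Windows-style separators like POSIX ones so both are stripped.
    normalized = client_name.replace("\\", "/")
    # Keep only the final path component, discarding any directories.
    base = PurePosixPath(normalized).name
    # Replace anything outside a conservative allow-list, then drop leading
    # dots so the result cannot be a hidden file or a ".." reference.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


if __name__ == "__main__":
    for name in ["report 2021.pdf", "../../etc/passwd", "..\\..\\cmd.exe", ".."]:
        print(repr(name), "->", repr(safe_upload_name(name)))
```

Many applications go a step further and generate their own storage name, keeping the original only as display text.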

File size or content

Similarly, malicious actors can manipulate file size or content to change how the file interacts with your application or server. 

Some web applications use the “Content-Type” field in the request header to decide whether a file is valid. For example, they might only accept files that list the content type as “text/plain.” Because the sender controls this value, malicious actors can set it to whatever the application expects, regardless of what the file really contains. This lets them deliver malware to your network through the uploaded document. 
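Because the declared content type is just a label supplied by the sender, one extra check is to compare it with the first bytes of the file itself. The Python sketch below assumes a small, hypothetical allow-list of types that have well-known leading signatures (PDF, PNG, and ZIP). It is one layer of defense only, since file signatures can also be forged.

```python
# Hypothetical allow-list: declared content type -> expected leading bytes.
MAGIC_PREFIXES = {
    "application/pdf": b"%PDF-",
    "image/png": b"\x89PNG\r\n\x1a\n",
    "application/zip": b"PK\x03\x04",
}


def declared_type_matches(declared_type: str, data: bytes) -> bool:
    """Return True only if the declared type is allowed and the bytes agree with it."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = declared_type.split(";")[0].strip().lower()
    prefix = MAGIC_PREFIXES.get(media_type)
    if prefix is None:
        return False  # Types outside the allow-list are rejected outright.
    return data.startswith(prefix)


if __name__ == "__main__":
    print(declared_type_matches("application/pdf", b"%PDF-1.7\n..."))
    print(declared_type_matches("application/pdf", b"MZ\x90\x00"))
```

In this sketch, an executable that claims to be a PDF is rejected because its first bytes do not match.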

Portable Document Format (PDF) Risks

While PDF isn’t a programming language, the code that specifies how a page should look can be manipulated. PDFs are a hierarchy of objects, meaning that they include a series of attributes that tell the document reader how they should look. The application used to read the document must render each element in a specific order. 

The National Institute of Standards and Technology (NIST) Special Publication (SP) 800-28 “Guidelines on Active Content and Mobile Code” details how PDFs can be used by cybercriminals, explaining: 

Full-featured PDF tools, such as Adobe Acrobat, may be more susceptible to attack by virtue of their extended functionality.  In the past, a demonstration showed that while the format itself may be benign, a PDF file could bear malicious code as an embedded file attachment [Fis01].  When the contaminated PDF file is opened for the demonstration, a game is launched that prompts the user to click on a moving image of a peach.  The occurrence of that event, in turn, causes the execution of an embedded VBScript file, which attempts to mail out the PDF file to others using Microsoft Outlook.

Because the PDF file has so many coded elements, malicious actors can hide malware inside the file. That malware then automatically runs when the user opens the file. This sets off an automated series of actions that the user can’t control. 
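As a rough illustration of where such code can hide, the Python sketch below counts a few PDF name keys that are commonly associated with scripts, automatic actions, and embedded files. The list of names and the function name are assumptions made for this example. The scan is deliberately naive: it will miss names that are obfuscated or stored inside compressed streams, so it is a triage aid rather than a detector.

```python
import re

# PDF name keys often linked to active content (an illustrative, incomplete list).
RISKY_PDF_NAMES = (
    b"/JavaScript",
    b"/JS",
    b"/OpenAction",
    b"/Launch",
    b"/EmbeddedFile",
)


def risky_pdf_names(data: bytes) -> dict[str, int]:
    """Count occurrences of each risky name key in the raw PDF bytes."""
    found = {}
    for name in RISKY_PDF_NAMES:
        # The lookahead stops "/JS" from matching a longer name such as "/JSX".
        pattern = re.compile(re.escape(name) + rb"(?![A-Za-z0-9])")
        count = len(pattern.findall(data))
        if count:
            found[name.decode("ascii")] = count
    return found


if __name__ == "__main__":
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
        b"2 0 obj\n<< /S /JavaScript /JS (app.alert('hi');) >>\nendobj\n"
    )
    print(risky_pdf_names(sample))
```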

Protecting Your Organization From Malicious Active Content

One way to protect your organization from malicious active content is to use a content threat removal tool. Other methods include:

Content disarm and reconstruction (CDR) 

CDR tools provide proactive threat prevention by scrubbing malicious active content from files sent by email or other channels. 

How it works

CDR tools remove some or all active content from a file. You create policies defining the file components that you don’t want coming in from outside your organization. 

The CDR tool examines the components and metadata contained in the file and compares them to the file type standards and the policies you created. Then, it removes anything that could be a security risk, whether or not it is actually malicious. 

For example, CDR tools can remove metadata like file size or path, as well as active content like macros. Then, once they’ve removed all the potentially risky elements, they rebuild the file. 
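The remove-then-rebuild idea can be sketched in a few lines of Python. This toy example treats an Office Open XML file as a ZIP container, applies a hypothetical policy that blocks the VBA macro part, and writes every other part into a fresh archive. The policy tuple and function name are invented for illustration. A real CDR product does far more, including validating each part against the file format and repairing the references to anything it removes.

```python
import io
import zipfile

# Hypothetical policy: archive members whose names end with these suffixes are dropped.
BLOCKED_SUFFIXES = ("vbaproject.bin",)


def rebuild_without_blocked(data: bytes, blocked=BLOCKED_SUFFIXES) -> bytes:
    """Copy allowed members into a brand-new archive, leaving blocked ones behind."""
    rebuilt = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as source:
        with zipfile.ZipFile(rebuilt, "w", zipfile.ZIP_DEFLATED) as target:
            for info in source.infolist():
                if info.filename.lower().endswith(blocked):
                    continue  # The policy says this component must not come in.
                target.writestr(info.filename, source.read(info.filename))
    return rebuilt.getvalue()


if __name__ == "__main__":
    original = io.BytesIO()
    with zipfile.ZipFile(original, "w") as archive:
        archive.writestr("xl/workbook.xml", "<workbook/>")
        archive.writestr("xl/vbaProject.bin", b"placeholder macro project")
    cleaned = rebuild_without_blocked(original.getvalue())
    print(zipfile.ZipFile(io.BytesIO(cleaned)).namelist())
```

Note that the output is a new file built only from allowed parts, rather than the original file with pieces cut out of it.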

Why you need more than traditional CDR

CDR does a great job removing active content risks. However, it does this by removing the active content. Unfortunately, the active content often serves a purpose. Sometimes these tools block files when they know that there are active elements in them, preventing users from getting important business information that they need to do their jobs. 

For example, a PDF’s active content is what keeps formatting, like fonts, consistent across different devices and readers. This is different from a Word document that requires the reader to have the same version or installed fonts as the person sending it. 

Once the CDR removes the active content, it reconstructs the business data as a PDF, without any of the functionalities that make PDFs useful. 

This leads to end-user productivity problems including:

  • Removing the ability to make further edits 
  • Changing how text is processed
  • Removing fields that need to be filled in
  • Removing active content that is important, such as macros in an Excel document

Anti-virus software

Most organizations use some form of anti-virus software to protect their user devices from malware and ransomware.

How it works

Anti-virus software tools use a large database of signatures, or patterns of known malicious code. They scan devices for these signatures, then quarantine the potential malware or ransomware. They serve a useful cybersecurity purpose by helping to secure endpoints, and every security program should use one. 

Why you need more

Cybercriminals continuously change their malware and ransomware code. This means that the anti-virus software providers need to continuously update their databases. Some anti-virus tools use artificial intelligence (AI) or machine learning (ML) to help “guess” the newest signatures. Although these advanced algorithms work well, they’re not foolproof. 
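A simplified Python sketch shows why small changes to malware defeat a signature database. It assumes the simplest possible kind of signature, a SHA-256 hash of the whole file. Real anti-virus engines use richer signatures and heuristics, and the payload bytes here are harmless placeholders.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_known_malicious(data: bytes, signatures: set[str]) -> bool:
    """Return True if the file's hash appears in the signature database."""
    return sha256_hex(data) in signatures


# Placeholder bytes standing in for a known sample and a slightly altered copy.
sample = b"pretend this is a malicious payload"
variant = sample + b" "
signatures = {sha256_hex(sample)}

if __name__ == "__main__":
    print(is_known_malicious(sample, signatures))
    print(is_known_malicious(variant, signatures))
```

The known sample matches, while the variant with one extra byte does not, so the database has to be updated before the variant can be caught.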

Additionally, if you can’t enforce updates on your employees’ devices, your antivirus may not work as well as it should. 

Votiro Will Never Flatten Active Content

Votiro’s Positive Selection® technology is the next generation of CDR, giving you all the benefits of content disarm and reconstruction without the end-user hassle. Instead of just removing the negative, we focus on maintaining the positive, including the file’s integrity. 

We never flatten content. We extract all the mission-critical elements of the file, including content, templates, and objects. Next, our solution processes any embedded object and analyzes the content directory. Finally, we rebuild the content with safe file templates. We do this with all files, including those embedded within emails and those downloaded from the internet. And it’s all done in microseconds. 

We bring our deep knowledge of file composition to ensure security and prevent weaponization. By focusing on the safe elements you need in a file, we reduce end-user frustration while giving you instant access to clean, secure documents.