Marian Ganišin
Posted on August 11, 2024
Keeping sensitive data secure and private is a top priority in software development. Application logs, one of the common leakage vectors, are carefully guarded to keep secrets out of them. The same concern applies to test logs, which can reveal passwords or access tokens. Tools running CI workflows usually provide mechanisms to mask sensitive data in logs with little to no effort. This is convenient and easy to use, but in certain situations it is not sufficient.
Why CI Workflow Masking Alone Might Not Be Enough
For instance, GitHub Actions does a good job of handling secrets. Any secret defined within the workflow is automatically masked in the captured output, which works like a charm. However, like any CI system, it has its limitations. If the output takes a different path, such as being saved to a file, written to a JUnit report, or sent to a remote log store, GitHub Actions cannot inspect the content and mask the secrets.
Moreover, tests won't always run within a CI workflow, and secrets may still need to be hidden when they don't. Imagine you're running tests locally and share a log to discuss an issue. Without realizing it, you include a URL with your access token.
Therefore, having a mechanism to handle sensitive data in test logs is essential at all levels. The best approach is to implement this directly at the test level or within the test framework itself. This ensures that secrets are not leaked from the primary source, preventing them from being passed up through the system.
Adding Protection at the Right Level
Maintaining the masking of secrets directly in tests can be relatively costly and error-prone, and often feels like a losing battle. For example, imagine a test that builds a URL with a token as a parameter. The URL must be rendered one way for the request and another way for the log.
In contrast, intercepting the report generation within the test framework provides an ideal opportunity to hook into the process and modify the records to eliminate sensitive data. This approach is transparent to the tests, requires no modifications to the test code, and functions just like the secret-masking feature in a CI workflow: simply run it and forget about managing the secrets. It automates the process, ensuring that sensitive data is protected without adding extra complexity to the test setup.
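To make the cost concrete, here is a minimal sketch of what masking at the test level can look like. The helper names (`build_url`, `loggable`), the endpoint, and the token are all hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode

MASK = "*****"


def build_url(base_url, token):
    """Return the real URL, as it must be sent in the request."""
    return f"{base_url}?{urlencode({'token': token})}"


def loggable(url, token):
    """Return a copy of the URL for logging (naive string replace)."""
    return url.replace(token, MASK)


# Hypothetical endpoint and token, used only for illustration.
url = build_url("https://api.example.test/items", "s3cr3t")
print(loggable(url, "s3cr3t"))  # https://api.example.test/items?token=*****
```

Every place that logs the URL has to remember to call the second helper. The naive replace is also fragile: a token such as `a/b` is percent-encoded in the URL as `a%2Fb`, so the replace finds nothing and the encoded token reaches the log.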
This is exactly what pytest-mask-secrets does when pytest is used to run the tests. Among its many features, pytest offers a rich and flexible plugin system. It lets a plugin hook into the process just before any log is generated, at a point when all the data has already been collected. This makes it easy to search for and remove sensitive values from the records before they are written out.
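The idea can be sketched as a small conftest.py. This is not the plugin's actual source. It assumes that `pytest_runtest_logreport` is a suitable hook (pytest calls it with each finished test report), that MASK_SECRETS may hold several comma-separated variable names, and the helper `build_pattern` is made up for this example.

```python
import os
import re

MASK = "*****"


def build_pattern(environ=os.environ):
    """Compile a regex that matches every secret value named in MASK_SECRETS.

    Assumption: MASK_SECRETS is a comma-separated list of variable names.
    """
    raw = environ.get("MASK_SECRETS", "")
    names = [name.strip() for name in raw.split(",") if name.strip()]
    values = [environ[name] for name in names if environ.get(name)]
    if not values:
        return None
    # Longest first, so a secret that contains another one is masked whole.
    values.sort(key=len, reverse=True)
    return re.compile("|".join(re.escape(value) for value in values))


def pytest_runtest_logreport(report):
    """Mask secrets in the captured sections of a finished test report."""
    pattern = build_pattern()
    if pattern is None:
        return
    # report.sections is a list of (title, content) pairs, for example
    # the captured log and the captured stdout of the test call.
    report.sections = [
        (title, pattern.sub(MASK, content))
        for title, content in report.sections
    ]
```

The sketch only covers captured sections. The failure traceback is stored separately in `report.longrepr` and would need the same treatment. A real plugin would likely also mark the hook with `@pytest.hookimpl(tryfirst=True)` so that it runs before the reporters that consume the report.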
Putting It to the Test: A Practical Demo
To illustrate how this works, a simple example will be most effective. Below is a trivial test that may not represent a real-world testing scenario but serves the purpose of demonstrating pytest-mask-secrets quite well.
import logging
import os


def test_password_length():
    password = os.environ["PASSWORD"]
    logging.info("Tested password: %s", password)
    assert len(password) > 18
In this example, there’s an assertion that is bound to fail, along with a log message that includes a secret. Yes, it might seem silly to include a secret in a log, but consider a scenario where you have a URL with a token as a parameter, and detailed debug logging is enabled. In such cases, libraries like requests might inadvertently log the secret in this way.
Now for the testing. First, set the secret needed for testing purposes:
(venv) $ export PASSWORD="TOP-SECRET"
Next, run the test:
(venv) $ pytest --log-level=info test.py
============================= test session starts ==============================
platform linux -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0
rootdir: /tmp/tmp.AvZtz7nHZS
collected 1 item
test.py F [100%]
=================================== FAILURES ===================================
_____________________________ test_password_length _____________________________
    def test_password_length():
        password = os.environ["PASSWORD"]
        logging.info("Tested password: %s", password)
>       assert len(password) > 18
E       AssertionError: assert 10 > 18
E        +  where 10 = len('TOP-SECRET')
test.py:8: AssertionError
------------------------------ Captured log call -------------------------------
INFO root:test.py:7 Tested password: TOP-SECRET
=========================== short test summary info ============================
FAILED test.py::test_password_length - AssertionError: assert 10 > 18
============================== 1 failed in 0.03s ===============================
By default, the secret value appears twice in the output: once in the captured log message and again in the failed assertion.
But what if pytest-mask-secrets is installed?
(venv) $ pip install pytest-mask-secrets
It also needs to be configured. The plugin needs a list of the environment variables that hold the secrets, which is provided by setting the MASK_SECRETS variable:
(venv) $ export MASK_SECRETS=PASSWORD
Now, rerun the test:
(venv) $ pytest --log-level=info test.py
============================= test session starts ==============================
platform linux -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0
rootdir: /tmp/tmp.AvZtz7nHZS
plugins: mask-secrets-1.2.0
collected 1 item
test.py F [100%]
=================================== FAILURES ===================================
_____________________________ test_password_length _____________________________
    def test_password_length():
        password = os.environ["PASSWORD"]
        logging.info("Tested password: %s", password)
>       assert len(password) > 18
E       AssertionError: assert 10 > 18
E        +  where 10 = len('*****')
test.py:8: AssertionError
------------------------------ Captured log call -------------------------------
INFO root:test.py:7 Tested password: *****
=========================== short test summary info ============================
FAILED test.py::test_password_length - AssertionError: assert 10 > 18
============================== 1 failed in 0.02s ===============================
Now, instead of the secret value, asterisks appear wherever the secret would have been printed. The job is done, and the test report is now free of sensitive data.
Closing Thoughts
From the example, it might seem that pytest-mask-secrets doesn't do much more than what GitHub Actions already does by default, making the effort appear redundant. However, as mentioned earlier, CI workflow tools only mask secrets in captured output, leaving JUnit files and other reports unaltered. Without pytest-mask-secrets, sensitive data could still be exposed in these files, and this applies to any report generated by pytest. On the other hand, pytest-mask-secrets does not mask direct output when the log_cli option is used, so the masking features of CI workflows are still useful. It’s often best to use both tools together to ensure the protection of sensitive data.
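One way to check a report file for leaks is a small scan like the one below, run for example against a JUnit file produced with pytest's `--junitxml=report.xml` option. This is only a sketch: the function name `find_leaks` is made up here, and it assumes MASK_SECRETS is a comma-separated list of variable names.

```python
import os
from pathlib import Path


def find_leaks(report_path, environ=os.environ):
    """Return the names of MASK_SECRETS variables whose values occur in the file.

    Assumption: MASK_SECRETS is a comma-separated list of variable names.
    """
    raw = environ.get("MASK_SECRETS", "")
    names = [name.strip() for name in raw.split(",") if name.strip()]
    text = Path(report_path).read_text(encoding="utf-8", errors="replace")
    # Plain substring search on the raw file content.
    return [name for name in names if environ.get(name) and environ[name] in text]
```

An empty list means none of the listed secrets appears verbatim in the file. Because the check is a plain substring search, a secret containing characters that XML escapes (such as `&` or `<`) would not be found in its escaped form.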
That's it. Thank you for taking the time to read this post. I hope it provided valuable insights into using pytest-mask-secrets to enhance the security of your testing process.
Happy Testing!