Discovery | Comparison Testing of JSON APIs


Rory Warin

Posted on June 6, 2023


At Bloomreach, quality is a primary focus for every release. Because many features are developed in parallel, our existing unit and integration tests were not sufficient on their own. We therefore added comparison testing of APIs to the release process: the production environment is replicated, the new release is deployed on the replica, and the APIs are tested against both. This helps predict how the new release will behave in production and ensures the quality and stability of the release.

With Bloomreach serving more than 300 customers, supporting about 1,600 QPS for Search and Personalization APIs, 3,000 QPS for Suggest APIs, and more during the holiday period, we cannot afford to break serving by deploying a buggy release. To ensure quality, the engineering and QA teams run comparison tests of these APIs: we randomly select frequent URLs across all merchants and exercise the API layer of the new release candidate by hitting two endpoints. One endpoint is the production environment serving our customers; the other is a replica of production running the new release. We then collect the responses from both endpoints and generate a report showing the differences in the JSON responses. The figure below illustrates this mechanism, and each step is elaborated in the following sections:

Comparison testing pipeline

Selection of random URLs

We randomly pick one-tenth of the requests from our production logs over the last 24 hours and generate unique, valid URLs. URLs are selected based on the presence of URL parameters such as sort, facet range, and filter query (fq). Weights are assigned to these parameters and averaged, and requests with greater weight are given preference. This increases the coverage of API features. Below is a code snippet for calculating the weight of a request:

import urllib
import urlparse  # Python 2 modules; on Python 3 these live in urllib.parse


def _clean_up(queries):
    """Remove debug and callback params"""
    for key in ('unwanted_param_1', 'unwanted_param_2', 'unwanted_param_3'):
        queries.pop(key, None)
    return queries


def _weight(queries):
    """Return the average weight of the query parameters present"""
    weights = {
        'q': 1,
        'fq': 2,
        'facet.field': 3,
        'sort': 4
    }
    vs = [weights[k] for k in queries if k in weights]
    return sum(vs) / len(vs) if vs else 0


def canonical_url(url):
    """Return (weight, canonical URL) for a raw request URL"""
    parsed = urlparse.urlparse(url)
    queries = urlparse.parse_qs(parsed.query)
    queries = _clean_up(queries)
    return _weight(queries), parsed._replace(query=urllib.urlencode(queries, True)).geturl()

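Building on canonical_url, the selection step could look roughly like the following sketch (the select_urls helper and its sample_ratio and top_n parameters are illustrative, not part of the actual pipeline):

import random

def select_urls(log_urls, sample_ratio=0.1, top_n=1000):
    """Sample raw request URLs, canonicalise them, and keep the highest-weighted unique ones."""
    sampled = random.sample(log_urls, int(len(log_urls) * sample_ratio))
    seen = {}
    for url in sampled:
        weight, canon = canonical_url(url)
        seen[canon] = weight  # de-duplicate on the canonical form
    ranked = sorted(seen.items(), key=lambda kv: kv[1], reverse=True)
    return [url for url, _ in ranked[:top_n]]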

Hitting both endpoints

Our production environment runs the current release. We replicate the production environment and deploy the new release on the replica. We then hit both endpoints with the selected URLs, collect the responses, and compare them. Any anomaly points to our code changes, since the rest of the environment, including configuration, is held constant.
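As a minimal sketch of this step (the hostnames are placeholders and the requests-based fetch_pair helper is illustrative, not our actual harness), fetching a response pair could look like this:

import requests

PROD = 'https://prod.api.com'
PROD_REPLICA = 'https://prod.replica.api.com'

def fetch_pair(path, timeout=10):
    """Fetch the same request path from production and the release-candidate replica."""
    prod_resp = requests.get(PROD + path, timeout=timeout)
    replica_resp = requests.get(PROD_REPLICA + path, timeout=timeout)
    return prod_resp.json(), replica_resp.json()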

Generating flattened JSON diff

We want the output of comparison testing to be easily consumable by our Quality Assurance team. A raw JSON diff is not a very efficient way of consuming the data, so we flatten it and produce it in an easy-to-read format.

A plain JSON diff doesn’t provide enough context, as shown below:

@@ -6,7 +6,6 @@
     "params": {
       "preferLocalShards": "true",
       "fl": "promotions:sale_keywords,alt_thumb_image:BR_STM_alt_thumb_image,description:long_desc,title:product_name,url:buy_url,brand:mfr_name_exact,pid:product_id,merchant_ap
-      "dominant_attributes": "{\"product\": {\"blouse\": 3020, \"tank\": 3020, \"shirt\": 3020, \"top\": 3020, \"tee\": 157, \"br_null\": 1, \"tshirt\": 3020}}",
       "attribute_profile": "",
       "user_has_profile": "false",
       "ltr_test": "spill",
@@ -27,7 +26,7 @@
       "facetRequestRewrite.enabled": "false",
       "start": "0",
       "rows": "10",
-      "psearch": "C2",
+      "psearch": "C1",
       "bannerCampaign.enabled": "false",
       "f.facet_dynamic_attr.facet.limit": "100000",
       "q": "top",
@@ -39,4 +38,4 @@
       "timeAllowed": "1600"
     }
   }
-}


The flattened JSON diff is generated with the following snippet:

import difflib
import json
from collections import OrderedDict

import jsonutils  # internal helper; pathitems() yields (path, value) pairs for each leaf


def path_json(data, ignore_keys):
    """Flatten a JSON object into 'path = value' lines"""
    return [u'{} = {}'.format(p, v)
            for p, v in jsonutils.pathitems(data, ignore_keys)]


def find_diff(prod, prod_replica, ignore_keys=(),
              prod_url='prod.api.com', prod_replica_url='prod.replica.api.com'):
    if prod == prod_replica:
        return []
    # Use OrderedDict with sorted keys so the diff reflects value changes, not key order
    prod_sorted = json.loads(json.dumps(prod, sort_keys=True), object_pairs_hook=OrderedDict)
    prod_replica_sorted = json.loads(json.dumps(prod_replica, sort_keys=True), object_pairs_hook=OrderedDict)
    prod_path_json = path_json(prod_sorted, ignore_keys)
    prod_replica_path_json = path_json(prod_replica_sorted, ignore_keys)
    diffs = difflib.unified_diff(prod_path_json, prod_replica_path_json, fromfile=prod_url, tofile=prod_replica_url)
    return list(diffs)

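Here, jsonutils is an internal helper. A rough, simplified equivalent of pathitems (our own sketch, not the actual implementation) could walk the nested object recursively and yield dotted-path/value pairs:

import json

def pathitems(data, ignore_keys=(), prefix=''):
    """Yield (dotted path, JSON-encoded leaf value) pairs for a nested JSON object."""
    if isinstance(data, dict):
        for key in sorted(data):
            if key in ignore_keys:
                continue
            for item in pathitems(data[key], ignore_keys, '{}.{}'.format(prefix, key)):
                yield item
    elif isinstance(data, list):
        for idx, value in enumerate(data):
            for item in pathitems(value, ignore_keys, '{}[{}]'.format(prefix, idx)):
                yield item
    else:
        yield prefix, json.dumps(data)  # keep quotes around string leaves, as in the report output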

The flattened diff looks like this and is much easier to understand:

--- prod.api.com
+++ prod.replica.api.com
@@ -1,12 +1,11 @@
 .responseHeader.QTime = 104
 .responseHeader.params.attribute_profile = ""
 .responseHeader.params.bose_test2 = "control"
-.responseHeader.params.dominant_attributes = "{\"product\": {\"blouse\": 3020, \"tank\": 3020, \"shirt\": 3020, \"top\": 3020, \"tee\": 157, \"br_null\": 1, \"tshirt\": 3020}}"
 .responseHeader.params.dominant_categories = "womensclothingtops|10,salewomensclothingtops|9"
 .responseHeader.params.fl = "promotions:sale_keywords,alt_thumb_image:BR_STM_alt_thumb_image,description:long_desc,title:product_name,url:buy_url,brand:mfr_name_exact,pid:produc
 .responseHeader.params.ltr_test = "spill"
 .responseHeader.params.preferLocalShards = "true"
-.responseHeader.params.psearch = "C2"
+.responseHeader.params.psearch = "C1"
 .responseHeader.params.q = "top"
 .responseHeader.params.rows = "10"
 .responseHeader.params.spellcheck = "false"


Generating the comparison report

After generating the diff, we produce a consolidated report in table format, which is easy for our QA team to understand and analyze. Below is a sample report showing, per merchant, the number of requests, the number of matched and unmatched responses, and the number of exceptions. For mismatches, we try to capture which part of the response differed, for example the header, the response body, numFound, a facet count, or some other part. The merchant's name links to the flattened JSON diff page for that merchant's individual responses.

Sample consolidated report
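A minimal sketch of how such per-merchant counts could be aggregated (the build_report helper and the shape of its results input are illustrative assumptions, not the actual reporting code):

from collections import Counter, defaultdict

def build_report(results):
    """Aggregate (merchant, diff, exception) tuples into per-merchant counters."""
    report = defaultdict(Counter)
    for merchant, diff, exception in results:
        stats = report[merchant]
        stats['requests'] += 1
        if exception:
            stats['exceptions'] += 1
        elif diff:
            stats['unmatched'] += 1
        else:
            stats['matched'] += 1
    return report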

Along with the random URLs, we also enrich the request set with a golden set of important URLs covering new features, sparse requests, and so on.

Reduced deployment time by 50%

In this blog, we have discussed the comparison testing of JSON APIs that Bloomreach performs as part of release management. We hit the production environment and a replicated production environment running the new release with a carefully selected set of requests, and compare their responses to build confidence in the release. Before comparison testing was introduced, we had limited insight into what the system state would be after a release was deployed, and this often resulted in release reversions. Since introducing this testing in CI, there have been no release reversions, and deployment time has dropped by 50% because the automated testing has eliminated a lot of manual QA effort.

Blog written by: Pankhuri from Bloomreach, 2021
