Extracting Data from Multiple Objects in a JSON Variable Using SQL: A Comprehensive Guide

As the amount of data stored in relational databases continues to grow, many organizations are turning to NoSQL databases and JSON columns as an alternative way to store it. One common use case for JSON is storing and querying large amounts of semi-structured data, such as configuration files, logs, or even entire web pages.

However, when working with JSON data in SQL Server, one of the more challenging tasks is extracting data from multiple objects held in a single variable. In this article, we will explore two methods to achieve this goal using the OPENJSON() function together with the APPLY operator.

Introduction to OPENJSON()

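The OPENJSON() function is a table-valued function in SQL Server (2016 and later, with database compatibility level 130 or higher) that parses JSON text into rows and columns. It takes two arguments: the JSON expression to parse, and an optional path that selects which part of the document to read. An optional WITH clause after the call defines the schema, that is, which columns to return, their data types, and the JSON path each one is read from. Without a WITH clause, OPENJSON() returns its default schema: one row per array element or object property, with the columns key, value, and type.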
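
To make the default schema concrete, here is a minimal sketch that calls OPENJSON() without a WITH clause. The sample document, including the status and count properties, is made up for illustration and is not taken from a real system.

```sql
-- Hypothetical sample document; all keys and values are illustrative only.
DECLARE @json nvarchar(max) = N'{
  "status": "ok",
  "count": 2,
  "result": [ { "management_account_id": "1001" } ]
}';

-- No WITH clause, so OPENJSON() uses its default schema:
-- one row per top-level property, with the columns [key], [value], and [type].
-- [type] is a code for the JSON type: 0 = null, 1 = string, 2 = number,
-- 3 = true/false, 4 = array, 5 = object.
SELECT [key], [value], [type]
FROM OPENJSON(@json);
```

With this input the query returns three rows, one each for status, count, and result. The value column always holds text, so nested arrays and objects come back as JSON strings that can be parsed again.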

Basic Usage

Here’s an example of how to use OPENJSON() with an explicit schema, defined in a WITH clause:

SELECT *
FROM OPENJSON(@json, '$.result')
WITH (
    id nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
);

In this example, the @json variable contains a JSON string whose result property holds an array of objects. The path '$.result' selects that array, OPENJSON() returns one row per array element, and the WITH clause extracts two columns from each element: id and lbl.
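
For a version that can be run as is, the sketch below declares a small @json variable first. The account IDs and labels are invented sample data, and the array shape of result is an assumption about the input.

```sql
-- Hypothetical sample data: "result" is an ARRAY of account objects.
DECLARE @json nvarchar(max) = N'{
  "result": [
    { "management_account_id": "1001", "management_account_label": "Sales" },
    { "management_account_id": "1002", "management_account_label": "Marketing" }
  ]
}';

-- One row per array element; each column path is resolved
-- relative to the element being read.
SELECT id, lbl
FROM OPENJSON(@json, '$.result')
WITH (
    id  nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
);
```

This returns two rows, (1001, Sales) and (1002, Marketing). A property that is missing from an element comes back as NULL rather than raising an error, because JSON paths use lax mode by default.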

Understanding the Schema Definition

The schema definition is where you specify which columns to extract from each nested object. Here’s a breakdown of what each part does:

  • $: This symbol refers to the root of the JSON text being read. In the path argument it is the whole document; in a column path it is the current element.
  • result: Together with $, this forms the path '$.result', which tells OPENJSON() to read the result property of our JSON data.
  • id nvarchar(50) '$.management_account_id': This reads the management_account_id property of each nested object and returns it as a column named id with a data type of nvarchar(50).
  • lbl nvarchar(50) '$.management_account_label': This reads the management_account_label property of each nested object and returns it as a column named lbl with a data type of nvarchar(50).

Using OPENJSON() with Multiple Objects

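The single WITH clause shown above is enough when $.result is an array of objects. Things get more complicated when the objects are stored as named properties of $.result, for example keyed by account ID. In that case OPENJSON() with a WITH clause treats $.result as one object and returns a single row, so the nested objects are never visited individually.

To extract data from each nested object, first call OPENJSON() with the default schema to get one row per nested object, then parse each row’s value with a second OPENJSON() call that has an explicit schema. The APPLY operator connects the two calls. The two methods below differ only in how the second call is written.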
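
The sketch below shows the kind of input where a single WITH clause falls short. It assumes, purely for illustration, that result is an object whose properties are keyed by account ID; the data is invented.

```sql
-- Hypothetical sample data: "result" is an OBJECT keyed by account ID,
-- not an array.
DECLARE @json nvarchar(max) = N'{
  "result": {
    "1001": { "management_account_id": "1001", "management_account_label": "Sales" },
    "1002": { "management_account_id": "1002", "management_account_label": "Marketing" }
  }
}';

-- The WITH clause is applied to the "result" object itself.
-- That object has no management_account_id property of its own,
-- so the query returns a single row with id = NULL and lbl = NULL.
SELECT id, lbl
FROM OPENJSON(@json, '$.result')
WITH (
    id  nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
);
```

The nested account objects are one level deeper than the paths in the WITH clause can reach, which is why a second parsing step is needed.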

Method 1: Using OPENJSON() with Default Schema and APPLY

Here’s how you can use OPENJSON() with default schema and an additional APPLY operator:

SELECT j2.*
FROM OPENJSON(@json, '$.result') j1
OUTER APPLY OPENJSON(j1.[value]) WITH (
    id nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
) j2

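In this example, the first OPENJSON() call uses the default schema and returns one row per item under $.result, with that item’s JSON text in the value column. The OUTER APPLY operator then runs the second OPENJSON() call once per row, and its WITH clause turns each value into the columns id and lbl. OUTER APPLY keeps a row (with NULL columns) even when the second call returns nothing, whereas CROSS APPLY would drop it.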
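
Here is Method 1 as a self-contained sketch. The sample data is invented and assumes the nested objects are properties of result keyed by account ID.

```sql
-- Hypothetical sample data: "result" holds one nested object per account.
DECLARE @json nvarchar(max) = N'{
  "result": {
    "1001": { "management_account_id": "1001", "management_account_label": "Sales" },
    "1002": { "management_account_id": "1002", "management_account_label": "Marketing" }
  }
}';

-- Step 1 (j1): default schema, one row per property of "result".
-- Step 2 (j2): parse each row's [value] with an explicit schema.
SELECT j2.id, j2.lbl
FROM OPENJSON(@json, '$.result') AS j1
OUTER APPLY OPENJSON(j1.[value]) WITH (
    id  nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
) AS j2;
```

This returns two rows, one per nested object. The same query also works when result is an array, because the default schema returns one row per array element as well.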

Method 2: Using OPENJSON() with Explicit Schema and APPLY

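Alternatively, you can wrap the explicit-schema OPENJSON() call in a subquery inside the APPLY operator:

SELECT j2.*
FROM OPENJSON(@json, '$.result') j1
OUTER APPLY (
    -- The WITH clause defines the schema for each nested object
    SELECT id, lbl
    FROM OPENJSON(j1.[value]) WITH (
        id nvarchar(50) '$.management_account_id',
        lbl nvarchar(50) '$.management_account_label'
    )
) AS j2

In this case, we’re using an inner query to explicitly define the schema for each nested object. The subquery form is useful when you want to filter, rename, or compute columns before they reach the outer query.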

Understanding the Results

Let’s take a closer look at what happens when you run these queries.

When you use OPENJSON() with default schema and an additional APPLY operator:

SELECT j2.*
FROM OPENJSON(@json, '$.result') j1
OUTER APPLY OPENJSON(j1.[value]) WITH (
    id nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
) j2
  • The first OPENJSON() call returns one row for each item under $.result, in a table with the columns key, value, and type.
  • The APPLY operator then runs the second OPENJSON() call against each row’s [value] column, and the WITH clause returns one row of typed columns for each nested object.
  • Finally, the outer query selects only the desired columns (id and lbl) from the applied result, j2.
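
The choice between OUTER APPLY and CROSS APPLY matters when an item cannot be parsed into any rows. The sketch below assumes a hypothetical entry, 1002, whose value is a JSON null, so the second OPENJSON() call has nothing to return for it.

```sql
-- Hypothetical sample data: account 1002 has no details (JSON null).
DECLARE @json nvarchar(max) = N'{
  "result": {
    "1001": { "management_account_id": "1001", "management_account_label": "Sales" },
    "1002": null
  }
}';

-- OUTER APPLY keeps every row of j1; when the inner call yields no rows,
-- the j2 columns are NULL.
SELECT j1.[key] AS account_key, j2.id, j2.lbl
FROM OPENJSON(@json, '$.result') AS j1
OUTER APPLY OPENJSON(j1.[value]) WITH (
    id  nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
) AS j2;

-- CROSS APPLY keeps only the rows of j1 for which the inner call yields rows.
SELECT j1.[key] AS account_key, j2.id, j2.lbl
FROM OPENJSON(@json, '$.result') AS j1
CROSS APPLY OPENJSON(j1.[value]) WITH (
    id  nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
) AS j2;
```

Use OUTER APPLY when every item under result should appear in the output, and CROSS APPLY when items without usable details should be skipped.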

On the other hand, when you use OPENJSON() with explicit schema inside an APPLY subquery:

SELECT j2.*
FROM OPENJSON(@json, '$.result') j1
OUTER APPLY (
    -- The WITH clause defines the schema for each nested object
    SELECT id, lbl
    FROM OPENJSON(j1.[value]) WITH (
        id nvarchar(50) '$.management_account_id',
        lbl nvarchar(50) '$.management_account_label'
    )
) AS j2
  • The inner query explicitly defines the schema for each nested object, using the same columns (id and lbl) as in the previous example.
  • The APPLY operator runs that inner query once per row of j1, so the result is the same single table of id and lbl values. The difference is that the subquery gives you a place to filter or reshape the columns before the outer query sees them.
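
Because the outer OPENJSON() call uses the default schema, its key column is still available in the outer query. The sketch below, on invented sample data, returns the property name of each nested object next to the extracted columns; the alias account_key is a hypothetical name chosen for illustration.

```sql
-- Hypothetical sample data: "result" is keyed by a short account code.
DECLARE @json nvarchar(max) = N'{
  "result": {
    "acct_a": { "management_account_id": "1001", "management_account_label": "Sales" },
    "acct_b": { "management_account_id": "1002", "management_account_label": "Marketing" }
  }
}';

-- j1.[key] is the property name ("acct_a", "acct_b");
-- j2 carries the columns parsed from the matching nested object.
SELECT j1.[key] AS account_key, j2.id, j2.lbl
FROM OPENJSON(@json, '$.result') AS j1
OUTER APPLY (
    SELECT id, lbl
    FROM OPENJSON(j1.[value]) WITH (
        id  nvarchar(50) '$.management_account_id',
        lbl nvarchar(50) '$.management_account_label'
    )
) AS j2;
```

When result is an array instead of an object, key holds the zero-based position of each element, returned as text.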

Conclusion

In this article, we explored two methods to extract data from multiple objects in a JSON variable in SQL Server. Both techniques start from OPENJSON() with the default schema and use an APPLY operator to parse each nested object with an explicit schema, and both turn complex JSON data into a relational table structure.

When deciding which method to use, consider the following factors:

  • Complexity of your JSON data: If your JSON variable contains several levels of nested objects, you can chain further APPLY operators, one per level. The subquery form keeps the schema for each level in its own block.
  • Performance requirements: Both methods parse each nested object with the same OPENJSON() call, so a large difference is not a given. Measure both against your own data instead of assuming one is faster.
  • Readability and maintainability: Choose the method that makes your code more readable and maintainable, especially if you’re working with a large team or on a complex project.
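
One way to compare the two methods is to run them in the same session with SET STATISTICS TIME ON and read the timings from the Messages output. This is only a sketch: the sample document is invented and far too small to give meaningful numbers, so substitute a representative payload of your own.

```sql
-- Hypothetical sample data; replace with a realistic document before measuring.
DECLARE @json nvarchar(max) = N'{
  "result": {
    "1001": { "management_account_id": "1001", "management_account_label": "Sales" },
    "1002": { "management_account_id": "1002", "management_account_label": "Marketing" }
  }
}';

SET STATISTICS TIME ON;

-- Method 1: APPLY directly on the second OPENJSON() call.
SELECT j2.id, j2.lbl
FROM OPENJSON(@json, '$.result') AS j1
OUTER APPLY OPENJSON(j1.[value]) WITH (
    id  nvarchar(50) '$.management_account_id',
    lbl nvarchar(50) '$.management_account_label'
) AS j2;

-- Method 2: APPLY on a subquery that wraps the second OPENJSON() call.
SELECT j2.id, j2.lbl
FROM OPENJSON(@json, '$.result') AS j1
OUTER APPLY (
    SELECT id, lbl
    FROM OPENJSON(j1.[value]) WITH (
        id  nvarchar(50) '$.management_account_id',
        lbl nvarchar(50) '$.management_account_label'
    )
) AS j2;

SET STATISTICS TIME OFF;
```

Comparing the actual execution plans of the two statements is another way to check whether the optimizer treats them differently.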

By mastering these techniques, you’ll be able to work efficiently with JSON data in SQL Server and unlock its full potential.


Last modified on 2024-01-01