Creating Nested JSON Files from Pandas DataFrames in Python: A Step-by-Step Guide

Creating a Nested JSON File from a Pandas DataFrame in Python

==============================================

In this article, we will explore how to create a nested JSON file from a Pandas DataFrame in Python. We’ll cover the basics of Pandas, JSON, and Python’s string formatting capabilities.

Introduction to Pandas


Pandas is a powerful library for data manipulation and analysis in Python. It provides data structures such as Series (1-dimensional labeled array) and DataFrames (2-dimensional labeled data structure with columns of potentially different types).

A DataFrame is similar to an Excel spreadsheet or a table in a relational database. It consists of rows and columns, where each column represents a variable and each row represents an observation.

Introduction to JSON


JSON (JavaScript Object Notation) is a lightweight data interchange format that is easy to read and write. It is commonly used for exchanging data between web servers and web applications.

A JSON object consists of key-value pairs, where keys are strings and values can be strings, numbers, booleans, arrays, or other JSON objects.

Introduction to Python’s String Formatting Capabilities


Python has a powerful string formatting system that allows us to insert values into strings. We’ll use this feature to create the desired JSON template.

Creating a Nested JSON File from a Pandas DataFrame


To create a nested JSON file from a Pandas DataFrame, we need to:

  1. Create a JSON template with placeholders for the data.
  2. Iterate over each row in the DataFrame and replace the placeholders with the corresponding values.
  3. Convert the resulting string to a Python object using json.loads().
  4. Store the result in a list.

Step-by-Step Example


Here’s an example code snippet that demonstrates how to create a nested JSON file from a Pandas DataFrame:

d = """{
    "Ord" : "%s",
    "MOT" : "%s",
    "MVT" : "%s",
    "CUST" : "%s",
    "milestone" : {
        "creation" : {
            "sla" : "%s",
            "plan" : "%s",
            "proposed" : "%s"
        },
        "Pickup" : {
            "sla" : "%s",
            "plan" : "%s",
            "proposed" : "%s"
        }
    }
}
"""

js = []

for item in df.values:
    js.append(json.loads(d%tuple(item.tolist())))

print(json.dumps(js))

In this example, we define a JSON template d with placeholders for the data. We then iterate over each row in the DataFrame df, replace the placeholders with the corresponding values using string formatting (%s), and convert the resulting string to a Python object using json.loads().

Output


The output of this code snippet is a list of JSON objects, where each object represents a single row in the DataFrame. The nested structure is preserved, as expected:

[{"Ord": "a", "MOT": "TR", "MVT": "TT", "CUST": "DEA", "milestone": {"creation": {"sla": "12-3-2020", "plan": "12-3-2020", "proposed": "12-3-2020"}, "Pickup": {"sla": "14-3-2020", "plan": "14-3-2020", "proposed": "14-3-2020"}}}, {"Ord": "b", "MOT": "ZR", "MVT": "TD", "CUST": "DET", "milestone": {"creation": {"sla": "15-3-2020", "plan": "15-3-2020", "proposed": "15-3-2020"}, "Pickup": {"sla": "16-3-2020", "plan": "16-3-2020", "proposed": "16-3-2020"}}}]

This output can be used as-is, or further processed to suit your needs.

Conclusion


In this article, we explored how to create a nested JSON file from a Pandas DataFrame in Python. We covered the basics of Pandas, JSON, and Python’s string formatting capabilities, and demonstrated a step-by-step example of how to achieve this task. With this knowledge, you should be able to create your own nested JSON files from Pandas DataFrames with ease.

Additional Tips


  • Make sure to handle missing values in the DataFrame before creating the JSON file.
  • Use json.dumps() to convert the list of JSON objects to a single string, if needed.
  • Consider using libraries like pandas_json or pandas_xml for more advanced JSON and XML handling capabilities.

Last modified on 2024-08-26