Aggregating Atomic Data with Python: A Pandas Approach to Atom-Specific Statistics

The following Python solution uses Pandas to count the observations and compute the mean ppm value for each atom.

import pandas as pd

# Define data
data = {
    'Atom': ['5.H6', '6.H6', '7.H8', '8.H6', '5.H6', '9.H8', '8.H6', '10.H6', '12.H6', '13.H6', '14.H6', '16.H8', '17.H8', '18.H6', '19.H8', '20.H8', '21.H8'],
    'ppm': [7.891, 7.693, 8.16859, 7.446, 7.72158, 8.1053, 7.65014, 7.54, 8.067, 8.047, 7.69624, 8.27957, 7.169, 7.385, 7.657, 7.78512, 8.06057],
    'unclear': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
}

# Create DataFrame
df = pd.DataFrame(data)

# Group by 'Atom' and aggregate using 'count' and 'mean'
gb = df.groupby('Atom')['ppm'].agg(['count', 'mean']).rename(columns={'count': 'nVa', 'mean': 'avgppm'})

# Selecting the single 'ppm' column before agg() already yields flat column names,
# so no droplevel() call is needed (on a single-level index it raises a ValueError)

print(gb)

When you run this code, it prints one row per unique atom, with the number of observations (nVa) and the mean ppm value (avgppm). The groupby method groups the rows by 'Atom', and the aggregation functions 'count' and 'mean' are then applied to the 'ppm' column of each group. In this data, 5.H6 and 8.H6 each appear twice, so their rows show a count of 2 and the average of their two ppm values; every other atom has a count of 1. The 'unclear' column is not used in the aggregation.
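The same grouping can also be sketched with named aggregation, which sets the output column names directly instead of renaming them afterwards. The sort step is an addition that is not part of the original solution: groupby orders the labels as strings, so '10.H6' would come before '5.H6', and the sketch assumes the number before the dot is an integer that should determine the order. The variable names stats and order are illustrative.

```python
import pandas as pd

# Same 'Atom' and 'ppm' values as in the example above
df = pd.DataFrame({
    'Atom': ['5.H6', '6.H6', '7.H8', '8.H6', '5.H6', '9.H8', '8.H6', '10.H6', '12.H6',
             '13.H6', '14.H6', '16.H8', '17.H8', '18.H6', '19.H8', '20.H8', '21.H8'],
    'ppm': [7.891, 7.693, 8.16859, 7.446, 7.72158, 8.1053, 7.65014, 7.54, 8.067,
            8.047, 7.69624, 8.27957, 7.169, 7.385, 7.657, 7.78512, 8.06057],
})

# Named aggregation: each keyword becomes an output column name
stats = df.groupby('Atom')['ppm'].agg(nVa='count', avgppm='mean')

# Assumption: the part before the dot is an integer; sort on it
# so that '5.H6' precedes '10.H6' instead of following string order
order = sorted(stats.index, key=lambda atom: int(atom.split('.')[0]))
stats = stats.loc[order]

print(stats)
```

Named aggregation keeps the column naming next to the function that produces each column, which can be easier to read when more statistics are added later.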


Last modified on 2024-12-28