Improving Data Consistency in a Flask Web Application: The Power of Global Variables

Problem Explanation

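The problem arises in a Flask web application where a DataFrame named merged, built in one function (data_prediction), is not visible in another (read_uploaded_file). In Python, assigning to a name inside a function creates a local variable by default, so the assignment never updates the module-level merged that the other function reads.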
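
The behavior can be reproduced without Flask or pandas. The following is a minimal sketch, assuming hypothetical function names (predict, show) and a plain list standing in for the DataFrame:

```python
# Module-level variable, standing in for the shared DataFrame.
merged = []

def predict():
    # No `global` declaration: this assignment creates a new LOCAL name
    # `merged` and leaves the module-level variable untouched.
    merged = ["row 1", "row 2"]
    return merged

def show():
    # Reads the module-level variable, which is still empty.
    return merged

predict()
print(show())  # prints []
```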

Solution Explanation

To solve this issue, declare merged as a global variable inside every function that assigns to it. Add global merged at the top of both data_prediction and read_uploaded_file, before the first assignment.
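
As a small illustration of the effect, here is a sketch with hypothetical function names (predict, show) and a list in place of the DataFrame. It also shows that the declaration matters only for assignment; a function that merely reads the variable does not need it:

```python
# Module-level variable, standing in for the shared DataFrame.
merged = []

def predict():
    global merged  # assignments below now rebind the module-level name
    merged = ["row 1", "row 2"]

def show():
    # Reading alone needs no `global`; it is required only for assignment.
    return merged

predict()
print(show())  # prints ['row 1', 'row 2']
```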

Here’s how you should modify your code:

from flask import Flask, render_template, request, send_from_directory
import base64
import io
import pickle
import re
import urllib.parse

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sn
from werkzeug.utils import secure_filename

app = Flask(__name__)

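# UPLOAD_FOLDER and allowed_file() are used below but not shown here;
# they are assumed to be defined elsewhere in the application.

# Module-level DataFrame that holds the most recent merged predictions
merged = pd.DataFrame()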

def data_prediction(filename):
    global merged
    
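    model_name = 'SVM.sav'
    # Load the trained model; the context manager closes the file afterwards
    with open(model_name, 'rb') as model_file:
        SVM = pickle.load(model_file)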

    df_prediction = pd.read_csv(filename,encoding = "ISO-8859-1")
    df_prediction = df_prediction.applymap(lambda x: x.strip() if isinstance(x, str) else x)

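    # Assign the result back; fillna(inplace=True) on a selected column may not
    # update the DataFrame in recent pandas versions
    df_prediction["Description"] = df_prediction["Description"].fillna(" ")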
    df_prediction['Full description'] = df_prediction['Short Description'] + " " +  df_prediction['Description']
    X_predict = df_prediction['Full description']
    
    documents_predict = []

    for sen in range(0, len(X_predict)):
        # Remove all the special characters
        document = re.sub(r'\W', ' ', str(X_predict[sen]))

        # remove all single characters
        document = re.sub(r'\s+[a-zA-Z]\s+', ' ', document)

        # Remove a single character at the start of the text
        # (the pattern must begin with the anchor ^, not a literal \^)
        document = re.sub(r'^[a-zA-Z]\s+', ' ', document)

        # Substitute multiple spaces with a single space
        document = re.sub(r'\s+', ' ', document)

        # Removing prefixed 'b'
        document = re.sub(r'^b\s+', '', document)

        # Converting to Lowercase
        document = document.lower()

        documents_predict.append(document)

    data_for_predict = pd.Series(documents_predict)
    predicted_svm_actual_data = SVM.predict(data_for_predict.values.astype('U'))
    output=pd.DataFrame(data={"Description":data_for_predict,"Prediction":predicted_svm_actual_data})

    merged = pd.merge(left=df_prediction, left_index=True,right=output, right_index=True,how='inner')
    columns = ['Description_x', 'Description_y']
    merged.drop(columns, inplace=True, axis=1)
    
    # Write the merged data (descriptions plus predicted category) to an Excel file.
    # The context manager saves the file on exit; writer.save() was removed in pandas 2.0.
    with pd.ExcelWriter("predicted_output.xlsx", engine='xlsxwriter') as writer:
        merged.to_excel(writer, sheet_name='Sheet1', index=False)
    
    # Debug output: show the merged result in the server console
    print(merged)
    
    return merged

@app.route('/read_file', methods=['GET'])
def read_uploaded_file():
    global merged
    
    # Default to an empty string so a missing parameter does not raise an error
    filename = secure_filename(request.args.get('filename', ''))
    product = request.args.get("product")

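    try:
        if filename and allowed_file(filename):
            if product == 'itanywhere':
                print('itanywhere is happening')
                merged = data_prediction(filename)
    except IOError as err:
        # Report the failure instead of silently discarding it
        print(f'Could not process {filename}: {err}')
    # To offer predicted_output.xlsx as a download, return
    # send_from_directory(UPLOAD_FOLDER, 'predicted_output.xlsx') from a separate
    # route; a call whose result is not returned has no effect.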

    count_by_prediction = merged.groupby('Prediction').count()[['Incident Number']].sort_values(by=['Incident Number'],ascending=False)
    # Draw a count plot of the predicted categories
    # (display() is an IPython helper and is not available in a Flask view)
    plt.figure(figsize=(20, 8))
    plt.xticks(rotation=90)
    sn.countplot(x='Prediction', data=merged)
    img = io.BytesIO()  # create the buffer
    plt.savefig(img, format='png', bbox_inches="tight")  # save figure to the buffer
    plt.close()  # release the figure so figures do not accumulate across requests
    img.seek(0)  # rewind the buffer
    plot_data = urllib.parse.quote(base64.b64encode(img.read()).decode()) # base64 encode & URL-escape
    return render_template('data.html',plot_url=plot_data,tables_summary=[count_by_prediction.to_html(classes='data')], titles_summary=count_by_prediction.columns.values,\
                           tables=[merged.to_html(classes='data')], titles=merged.columns.values)




if __name__ == "__main__":
    app.run(host='0.0.0.0')

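With these changes, both functions read and write the same module-level merged, so the table and plot rendered by read_uploaded_file reflect the latest predictions. Keep in mind that a module-level variable lives in a single server process and is shared by every request that process handles, so with multiple workers or concurrent users the stored data can differ from one request to the next.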

Further Improvements

Check that uploaded files are processed correctly. An error that is caught and ignored during this step leaves merged unchanged, so the page may show stale or empty results.
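
One way to catch a bad upload early is to verify the header before running the prediction. The sketch below uses only the standard library; the helper name missing_columns is hypothetical, and the required column names are taken from the code above, so treat the exact set as an assumption about your files:

```python
import csv
import io

# Column names used by the prediction and summary code above (an assumption
# about what every uploaded file must contain).
REQUIRED_COLUMNS = {"Short Description", "Description", "Incident Number"}

def missing_columns(csv_text):
    """Return the sorted list of required columns absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)

sample = "Incident Number,Short Description\nINC001,Printer offline\n"
print(missing_columns(sample))  # prints ['Description']
```

If the returned list is not empty, the view could report the problem to the user rather than rendering results from an earlier upload.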

Also note that global variables make code harder to read, test, and maintain. In larger applications, it is usually better to pass data between functions as parameters and return values, or to use a more structured approach such as a class that owns the data.
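
A minimal sketch of the parameter-passing style follows. The function names (predict, summarize, handle_request) are hypothetical, and a simple keyword rule stands in for the SVM model purely for illustration:

```python
from collections import Counter

def predict(rows):
    """Stand-in for data_prediction: returns its result instead of storing it globally."""
    # Hypothetical keyword rule in place of the trained model.
    return [
        {"text": row, "Prediction": "hardware" if "printer" in row.lower() else "other"}
        for row in rows
    ]

def summarize(predictions):
    """Stand-in for the reporting step: receives the data as an argument."""
    return Counter(item["Prediction"] for item in predictions)

def handle_request(rows):
    predictions = predict(rows)    # data flows through return values...
    return summarize(predictions)  # ...and parameters, not module state

print(handle_request(["Printer offline", "VPN down", "printer jam"]))
# prints Counter({'hardware': 2, 'other': 1})
```

Because each call works only on the data it is given, two requests cannot overwrite each other's results.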


Last modified on 2024-12-31