Last-Mile Logistics
  • Optimizing Delivery Partner Allocation in Last-Mile Logistics
  • Dataset
  • Delivery Data Preprocessing
  • Data Prepration Guide
  • Preprocessing Using Power BI and Excel
  • Pickup Data Preprocessing
  • Data Merging Guide
  • Approximate Distances and Outlier Detection
  • Dashboard-I
  • SLA Compliance Calculation
  • Dashboard-II
  • Dashboard-III
  • Dashboard-IV
    • Delivery KPIs Issue
  • Dashboard-V
  • Dashboard-VI
  • Advanced Data Preparation
  • Dashboard VII
  • Data Handling
  • ETA Creation
  • Dashboard VIII
  • Final Analysis Report
  • Machine Learning Team Report
Powered by GitBook
On this page

Preprocessing Using Power BI and Excel

PreviousData Prepration GuideNextPickup Data Preprocessing

Last updated 9 months ago

CtrlK
  • 1. Data Preprocessing in Power BI
  • 2. Data Preprocessing in Excel
  • Next Steps

1. Data Preprocessing in Power BI

Steps to Follow:

  1. Load Data into Power BI:

    • Open Power BI Desktop.

    • Use Get Data to import all five city datasets.

    • Append the datasets by selecting Home > Transform Data > Append Queries.

  2. Add a City Column:

    • If the city column is missing, use the Add Column > Custom Column feature in Power Query to include the city name.

  3. Remove Duplicates:

    • In Power Query, use Remove Duplicates for package_id to ensure no duplicate rows.

  4. Handle Missing Values:

    • Replace or remove null values using Transform > Replace Values or Remove Rows.

  5. Create Delivery Duration:

    • Add a custom column in Power Query to calculate delivery duration:

      Duration = [delivery_time] - [accept_time]
  6. Compute Geographical Distance:

    • If distance calculations are needed, you can use DAX formulas after loading the data into the model, or preprocess using approximate calculations for now.

  7. Filter Outliers:

    • Visualize delivery durations and distances in Power BI visuals.

    • Use filters to exclude unrealistic values.

  8. Save the Processed Data:

    • Apply all changes in Power Query, and use Export Data from visuals if a cleaned file is needed.


2. Data Preprocessing in Excel

Steps to Follow:

  1. Merge Data:

    • Copy and paste each city dataset into a master workbook.

    • Add a column to each dataset specifying the city name.

  2. Remove Duplicates:

    • Use Data > Remove Duplicates on package_id.

  3. Handle Missing Values:

    • Use filters or Conditional Formatting to identify and replace/remove null values.

  4. Calculate Delivery Duration:

    • Create a new column:

    • Format the column as time.

  5. Approximate Distance Calculation:

    • Add a formula for Euclidean distance (simplified for flat surfaces):

  6. Detect Outliers:

    • Use Conditional Formatting or a pivot table to highlight extremely high/low delivery durations or distances.

  7. Save the Processed File:

    • Save the cleaned file as a new workbook.


Next Steps

Once the preprocessing is completed in Power BI or Excel:

  1. Validate the data quality by reviewing summary statistics.

  2. Use this cleaned dataset for analysis and modeling.

=delivery_time - accept_time
=SQRT((delivery_gps_lng - accept_gps_lng)^2 + (delivery_gps_lat - accept_gps_lat)^2)