Pairing Lego Pieces Based on Measurement and Colour
In this article, we will explore a real-world problem of pairing Lego pieces based on their measurements and colours. We will break down the solution step by step and provide explanations for each part.
Introduction
The problem involves creating pairs of Lego pieces that are in the same set, have the same colour, and are within 2 mm of each other in length. The goal is to create a new column, pair, in which pairs are numbered sequentially within each piece set.
Problem Statement
We are given a dataset of toy measurements that records the set each piece belongs to (Piece_ID), the colour of the piece (Colour), and the length of the piece (Length_mm). We need to generate a new column, pair, from these measurements within each Piece_ID and Colour group. Two Lego pieces count as a pair if they are in the same set, are the same colour, and have lengths within 2 mm of each other. The code in Step 3 applies this tolerance as a threshold of 0.02 on the values in Length_mm.
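The pairing rule can be written as a small predicate. This is a minimal sketch, not part of the original solution: is_pair and its argument names are hypothetical, and the default tol of 0.02 is an assumption taken from the threshold used in the Step 3 code.

```r
# Hypothetical helper: TRUE when two pieces satisfy the pairing rule.
# `tol` is an assumed tolerance, in the same units as the lengths.
is_pair <- function(set_a, colour_a, length_a,
                    set_b, colour_b, length_b, tol = 0.02) {
  set_a == set_b &&
    colour_a == colour_b &&
    # Round to two decimals so floating-point noise does not affect the check
    round(abs(length_a - length_b), 2) <= tol
}

is_pair("A", "Blue", 2.88, "A", "Blue", 2.90)  # same set and colour, lengths 0.02 apart
is_pair("B", "Blue", 3.51, "B", "Blue", 3.54)  # same set and colour, lengths 0.03 apart
is_pair("A", "Blue", 2.88, "B", "Blue", 2.88)  # different sets
```

Only the first call satisfies all three conditions; the second fails on length and the third on set.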
Approach
Our approach builds the pairs with a combination of logical comparisons and an iterative pass over the rows. We will use the R programming language to solve this problem.
Step 1: Creating DataFrames
We start by reading the dataset into two separate dataframes, df and pairs. Both hold the same table; only df is used in the steps that follow. The code for this step is as follows:
table <- "Piece_ID Width_mm Length_mm Colour
1 A 1.68 3.19 Unknown
2 A 1.47 2.88 Blue
3 A 1.64 2.90 Blue
4 A 1.80 3.20 Unknown
5 B 1.76 3.12 Red
6 B 1.61 3.11 Red
7 B 1.57 3.51 Blue
8 B 1.48 3.54 Blue
9 A 1.46 4.05 Green"
#Create a dataframe with the above table
df <- read.table(text=table, header = TRUE)
# Read the same table into a second dataframe
pairs <- read.table(text = table, header = TRUE)
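Before pairing, it can help to look at the lengths inside each Piece_ID and Colour group, since only pieces in the same group can form a pair. The following is a small sketch using base R's split; the variable name groups is an illustrative choice, not part of the original solution.

```r
table <- "Piece_ID Width_mm Length_mm Colour
1 A 1.68 3.19 Unknown
2 A 1.47 2.88 Blue
3 A 1.64 2.90 Blue
4 A 1.80 3.20 Unknown
5 B 1.76 3.12 Red
6 B 1.61 3.11 Red
7 B 1.57 3.51 Blue
8 B 1.48 3.54 Blue
9 A 1.46 4.05 Green"
df <- read.table(text = table, header = TRUE)

# Collect the lengths of each (Piece_ID, Colour) group; drop = TRUE omits
# combinations that do not occur in the data (for example set B, Green)
groups <- split(df$Length_mm, list(df$Piece_ID, df$Colour), drop = TRUE)
groups
```

Each list element is named after its set and colour, such as A.Blue, and holds the lengths that are candidates for pairing with one another.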
Step 2: Initializing Variables
We initialize several variables to keep track of our progress:
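# Pair number for each row; Inf marks rows that have not been assigned yet
df$pair <- Inf
# The first row always starts pair 1
df$pair[[1]] <- 1
# Absolute length difference from the previous row (filled in during the loop)
df$abs_diff <- NA
# Piece sets (Piece_ID values) encountered so far
seen_vec <- character(0)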
Step 3: Iterative Pairing Process
We then iterate through each row in the df
dataframe. If it is not the first row, we calculate the absolute difference between the current length and the previous length:
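for (i in seq_len(nrow(df))) {
  if (i != 1) {
    # Length difference from the previous row, rounded to two decimals
    df$abs_diff[[i]] <- round(abs(df$Length_mm[[i]] - df$Length_mm[[i - 1]]), 2)
    if (!df$Piece_ID[[i]] %in% seen_vec) {
      # First row of a set not seen before: restart numbering at 1
      df$pair[[i]] <- 1
      seen_vec <- c(seen_vec, df$Piece_ID[[i]])
    } else if (df$Colour[[i]] != df$Colour[[i - 1]]) {
      # Colour changed: start a new pair
      df$pair[[i]] <- df$pair[[i - 1]] + 1
    } else if (df$abs_diff[[i]] > 0.02) {
      # Same colour but lengths too far apart: start a new pair
      df$pair[[i]] <- df$pair[[i - 1]] + 1
    } else {
      # Same colour and lengths within the threshold: same pair as the previous row
      df$pair[[i]] <- df$pair[[i - 1]]
    }
  } else {
    # First row: record its set as seen (its pair number was set in Step 2)
    seen_vec <- df$Piece_ID[[i]]
  }
}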
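The loop compares each row only with the row directly before it, so its result depends on row order. One possible alternative, sketched here rather than taken from the original solution, is to sort by Piece_ID, Colour, and Length_mm first and then number the pairs with vectorised base R functions. The names sorted, same_group, close, and new_pair are illustrative, and the 0.02 threshold is an assumption carried over from the loop.

```r
df <- read.table(text = "Piece_ID Width_mm Length_mm Colour
1 A 1.68 3.19 Unknown
2 A 1.47 2.88 Blue
3 A 1.64 2.90 Blue
4 A 1.80 3.20 Unknown
5 B 1.76 3.12 Red
6 B 1.61 3.11 Red
7 B 1.57 3.51 Blue
8 B 1.48 3.54 Blue
9 A 1.46 4.05 Green", header = TRUE)

# Sort so that pieces of the same set and colour are adjacent, ordered by length
sorted <- df[order(df$Piece_ID, df$Colour, df$Length_mm), ]
n <- nrow(sorted)

# Does each row share its set and colour with the previous row?
same_group <- c(FALSE, sorted$Piece_ID[-1] == sorted$Piece_ID[-n] &
                       sorted$Colour[-1] == sorted$Colour[-n])

# Is each row's length within the assumed threshold of the previous row's?
close <- c(FALSE, round(diff(sorted$Length_mm), 2) <= 0.02)

# A row starts a new pair unless it matches the previous row on both counts
new_pair <- !(same_group & close)

# Number the pairs sequentially within each set
sorted$pair <- ave(as.integer(new_pair), sorted$Piece_ID, FUN = cumsum)
sorted
```

With the rows sorted, the two Unknown pieces in set A (original rows 1 and 4) become adjacent and share a pair number. Like the loop, this sketch links each piece to its sorted neighbour, so a run of closely spaced lengths would all receive the same number.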
Step 4: Final Output
After iterating through each row, we output the final df dataframe.
df
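The resulting dataframe contains a new column called pair, in which pairs of Lego pieces are numbered sequentially within each piece set. Because each row is compared only with the row immediately before it, this numbering relies on pieces from the same set and colour being adjacent in the data. In the sample table, rows 1 and 4 (set A, colour Unknown, lengths 3.19 and 3.20) are separated by the two Blue rows, so they receive different pair numbers.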
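A quick way to sanity-check a pair column is to confirm that every pair holds a single colour and that its lengths stay within the tolerance. The helper below is a hypothetical sketch: check_pairs, its arguments, and the default tolerance of 0.02 are assumptions, and the example dataframe uses three rows from the sample data with pair numbers entered by hand rather than produced by the loop.

```r
# Hypothetical helper: TRUE when every (Piece_ID, pair) group contains one
# colour and its lengths span no more than `tol` (an assumed tolerance)
check_pairs <- function(d, tol = 0.02) {
  groups <- split(d, list(d$Piece_ID, d$pair), drop = TRUE)
  all(vapply(groups, function(g) {
    length(unique(g$Colour)) == 1 &&
      round(diff(range(g$Length_mm)), 2) <= tol
  }, logical(1)))
}

# Hand-made example: the two Blue pieces share pair 1, the Green piece is pair 2
example <- data.frame(
  Piece_ID  = c("A", "A", "A"),
  Colour    = c("Blue", "Blue", "Green"),
  Length_mm = c(2.88, 2.90, 4.05),
  pair      = c(1, 1, 2)
)
check_pairs(example)
```

The same function can be called on the df produced in Step 3, since it only needs the Piece_ID, Colour, Length_mm, and pair columns.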
Last modified on 2024-01-18