Pairing Lego Pieces Based on Measurement and Colour: A Step-by-Step Solution Using R

This article works through a practical problem: pairing Lego pieces based on their measurements and colours. The solution is broken down step by step, with an explanation for each part.

Introduction

The problem is to pair Lego pieces that belong to the same set, have the same colour, and differ in length by no more than 0.02 mm (the threshold used in the code below). The goal is a new column, pair, in which pairs are numbered sequentially within each piece set.

Problem Statement

We are given a dataset of toy measurements that includes the toy set each piece belongs to (Piece_ID), the colour of the piece (Colour), and its length (Length_mm). The width (Width_mm) is also recorded but plays no part in the pairing. We need to generate a new column, pair, computed within each Piece_ID and Colour group. Two Lego pieces count as a pair if they are in the same set, are the same colour, and their lengths are within 0.02 mm of each other.
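
The pairing rule can be written down as a small predicate before any looping is involved. The following is a minimal sketch, not part of the original solution: the helper name is_pair and its tol argument are hypothetical, and the default tolerance of 0.02 is an assumption that mirrors the threshold used in the code of Step 3.

```r
# Hypothetical helper: TRUE when two pieces meet all three pairing conditions.
# `tol` is an assumed tolerance, in the same units as Length_mm.
is_pair <- function(a, b, tol = 0.02) {
  a$Piece_ID == b$Piece_ID &&
    a$Colour == b$Colour &&
    # Round the difference so floating-point noise does not affect the comparison
    round(abs(a$Length_mm - b$Length_mm), 2) <= tol
}

# Three pieces taken from the sample data
p1 <- list(Piece_ID = "A", Colour = "Blue", Length_mm = 2.88)
p2 <- list(Piece_ID = "A", Colour = "Blue", Length_mm = 2.90)
p3 <- list(Piece_ID = "B", Colour = "Blue", Length_mm = 3.51)

is_pair(p1, p2)  # same set, same colour, lengths close together
is_pair(p1, p3)  # different set, so never a pair
```

Change tol to whatever tolerance your own measurements require.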

Approach

Our approach walks through the rows one at a time and compares each piece with the one before it. A piece joins the current pair when the set, colour, and length conditions all hold, and starts a new pair number otherwise. We use the R programming language throughout.

Step 1: Creating DataFrames

We start by reading the dataset into two dataframes with the same contents, df and pairs. Only df is modified in the steps that follow. The code for this step is as follows:

table <- "Piece_ID Width_mm   Length_mm  Colour
1  A     1.68 3.19 Unknown
2  A     1.47 2.88 Blue
3  A     1.64 2.90 Blue
4  A     1.80 3.20 Unknown
5  B     1.76 3.12 Red
6  B     1.61 3.11 Red
7  B     1.57 3.51 Blue
8  B     1.48 3.54 Blue
9  A     1.46 4.05 Green"

# Create a dataframe from the table above.
# The unnamed first column (1 to 9) becomes the row names.
df <- read.table(text = table, header = TRUE, stringsAsFactors = FALSE)

# Keep an untouched copy of the same data; only df is modified in later steps
pairs <- df
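
One detail of the input is easy to miss: the header names four columns, while every data row has five fields. When the header has one field fewer than the data rows, read.table uses the first field of each row as the row name. The sketch below illustrates this on a shortened version of the table; the name demo is only for illustration.

```r
# Shortened copy of the sample table: 3 header names, 4 fields per data row
demo <- read.table(text = "Piece_ID Length_mm Colour
1 A 3.19 Unknown
2 A 2.88 Blue", header = TRUE, stringsAsFactors = FALSE)

names(demo)     # only the three named columns
rownames(demo)  # the leading 1 and 2 were taken as row names
str(demo)       # Piece_ID and Colour are character, Length_mm is numeric
```

Because the leading numbers are stored as row names rather than as a column, they stay attached to each row if the dataframe is reordered later.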

Step 2: Initializing Variables

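We add two working columns to df and create a vector that records which piece sets have already been seen:

df$pair <- Inf            # placeholder pair number, overwritten in Step 3
df$pair[[1]] <- 1         # the first row always starts pair 1
df$abs_diff <- NA         # length difference to the previous row
seen_vec <- character(0)  # Piece_ID values encountered so far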

Step 3: Iterative Pairing Process

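We then loop over the rows of df. Each row is compared only with the row directly above it, so the rows are first sorted by Piece_ID, Colour, and Length_mm to place candidate partners next to each other. In the unsorted sample, rows 1 and 4 (both set A, colour Unknown, lengths 3.19 and 3.20) are not adjacent and would otherwise never be compared. For every row after the first, we calculate the absolute difference between the current length and the previous length and then assign a pair number:

# Sort so that pieces of the same set and colour are adjacent, ordered by length
df <- df[order(df$Piece_ID, df$Colour, df$Length_mm), ]

for (i in seq_len(nrow(df))) {
  if (i != 1) {
    # Length difference to the previous row, rounded to avoid floating-point noise
    df$abs_diff[[i]] <- round(abs(df$Length_mm[[i]] - df$Length_mm[[i - 1]]), 2)
    if (!df$Piece_ID[[i]] %in% seen_vec) {
      # First piece of a new set: restart the numbering at 1
      df$pair[[i]] <- 1
      seen_vec <- c(seen_vec, df$Piece_ID[[i]])
    } else if (df$Colour[[i]] != df$Colour[[i - 1]]) {
      # Same set, different colour: start a new pair number
      df$pair[[i]] <- df$pair[[i - 1]] + 1
    } else if (df$abs_diff[[i]] > 0.02) {
      # Same set and colour, but lengths too far apart: start a new pair number
      df$pair[[i]] <- df$pair[[i - 1]] + 1
    } else {
      # Same set, same colour, lengths within the threshold: join the previous pair
      df$pair[[i]] <- df$pair[[i - 1]]
    }
  } else {
    # First row overall: it opens pair 1 of its set
    df$pair[[i]] <- 1
    seen_vec <- df$Piece_ID[[i]]
  }
}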
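
The same numbering can also be produced without an explicit loop. The sketch below is an alternative under stated assumptions, not the method of the original solution: it rebuilds the sample data under the hypothetical name lego, sorts it so that candidate partners are adjacent, marks every row that starts a new pair, and numbers the pairs within each set using ave() and cumsum(). The 0.02 threshold is taken from the loop above.

```r
# Sample data rebuilt here so the sketch runs on its own (Width_mm omitted)
lego <- data.frame(
  Piece_ID  = c("A", "A", "A", "A", "B", "B", "B", "B", "A"),
  Length_mm = c(3.19, 2.88, 2.90, 3.20, 3.12, 3.11, 3.51, 3.54, 4.05),
  Colour    = c("Unknown", "Blue", "Blue", "Unknown", "Red", "Red",
                "Blue", "Blue", "Green"),
  stringsAsFactors = FALSE
)

# Sort so that pieces of the same set and colour are adjacent, ordered by length
lego <- lego[order(lego$Piece_ID, lego$Colour, lego$Length_mm), ]
n <- nrow(lego)

# Rounded length gap between each row and the row above it
gap <- round(abs(diff(lego$Length_mm)), 2)

# A row starts a new pair when the set changes, the colour changes,
# or the gap exceeds the threshold; the first row always starts one
new_pair <- c(
  TRUE,
  lego$Piece_ID[-1] != lego$Piece_ID[-n] |
    lego$Colour[-1] != lego$Colour[-n] |
    gap > 0.02
)

# Running count of pair starts, restarted for every Piece_ID
lego$pair <- ave(as.integer(new_pair), lego$Piece_ID, FUN = cumsum)
lego
```

Including the Piece_ID change in new_pair guarantees that the count in every set begins at 1, even when the last piece of one set and the first piece of the next happen to share a colour and a similar length.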

Step 4: Final Output

After iterating through each row, we output the final df dataframe.

df

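The resulting dataframe contains the new column pair, in which pairs of Lego pieces are numbered sequentially within each piece set, alongside the helper column abs_diff. Pieces that share both a Piece_ID and a pair number belong together.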
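
A pair number that occurs only once in a set marks a piece without a partner. One way to separate matched from unmatched pieces is to count the rows per Piece_ID and pair number. The sketch below assumes a result in the same shape as df; the dataframe res and its values are made up for illustration and are not the output of the code above.

```r
# Illustrative result: made-up values in the same shape as the final df
res <- data.frame(
  Piece_ID = c("A", "A", "A", "B", "B", "B"),
  Colour   = c("Blue", "Blue", "Green", "Red", "Red", "Blue"),
  pair     = c(1, 1, 2, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Count how many pieces share each (Piece_ID, pair) combination
res$n <- 1
sizes <- aggregate(n ~ Piece_ID + pair, data = res, FUN = sum)

matched   <- sizes[sizes$n >= 2, ]  # numbers shared by two or more pieces
unmatched <- sizes[sizes$n == 1, ]  # pieces left without a partner
matched
unmatched
```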


Last modified on 2024-01-18