>[!abstract] >Kaprekar's constant is the number 6174, named after the Indian mathematician D. R. Kaprekar who identified the following interesting property about it in 1955: > >1. Pick any natural number $N$ less than 10,000, padding it with leading zeros as needed to make it a four-digit number. It must have at least two different digits. >2. Sort the digits in descending and then in ascending order to form two four-digit numbers, padding them with leading zeros as needed. >3. Subtract the smaller number from the bigger number. >4. Go back to step 2 and repeat. > >The above procedure, known as Kaprekar's routine, will always reach the number 6174 in no more than 7 iterations. Iterating further will continue yielding 6174, which is a *fixed point* of the routine. > >There is nothing special about the number 6174, however. There are infinitely many procedures that converge to a fixed point, oscillate between two values, or diverge with no repeating pattern. Kaprekar's constant is just well known because of its simplicity and guaranteed convergence to a relatively small number. >[!example] Example with $N = 1549$ >$9541 - 1459 = 8082$ >$8820 - 0288 = 8532$ >$8532 - 2358 = 6174$ >$7641 - 1467 = 6174$ >$\ldots$ ## Why does it converge to 6174? Let $N \in \mathbb{Z}$ such that $0001 \leq N \leq 9999$ and $N \notin \{1111, \ldots, 9999\}$. (WIP) ### Translating Kaprekar’s routine into R Let’s first define a function that applies Kaprekar’s routine once to an input value $n$. I could have used the `sprintf` function to pad $n$ with leading zeros where necessary, but that would have required a type conversion to `character` and then back to `integer` again. It feels cleaner to use integer division and modulo operations to split the digits of $n$ directly into a vector of four `integer` elements. ``` r k_step <- function(n) { # Check that n is a positive integer smaller than 10^4 stopifnot(n >= 1, n <= 9999, n == floor(n)) # Split the digits of n and pad them with leading zeros if necessary digits <- n %/% 10^(3:0) %% 10 # Sort digits in descending order to form the largest possible number dig_hi <- sort(digits, decreasing = TRUE) # Sort digits in ascending order to form the smallest possible number dig_lo <- sort(digits, decreasing = FALSE) # Return the difference between those two numbers sum(dig_hi * 10^(3:0) - dig_lo * 10^(3:0)) } ``` We also need a function to check if all digits of $n$ are identical, as such numbers do not converge to Kaprekar’s constant and will need special treatment. ``` r k_check <- function(n) { # Split the digits of n and pad them with leading zeros if necessary digits <- n %/% 10^(3:0) %% 10 # Return TRUE if all digits are the same all(digits == digits[1]) } ``` We can now define a function that counts iterations needed to reach Kaprekar’s constant from a given input number $n$. Note that I use R’s `L` shorthand notation to ensure that the iteration count `i` is stored as an integer and not a double. ``` r k_iter <- function(n, target = k_c) { # Check if n has all identical digits, return NA if so if (k_check(n)) return(NA_integer_) # Initialize iteration count i <- 0L # Apply Kaprekar's routine until we reach the constant while(n != 6174) { n <- k_step(n) i <- i + 1L } # Return the number of iterations i } ``` ### Applying the routine to all values of n \< 10,000 With our functions defined, we can now compute the count of iterations for all values of $n$ less than 10^4. The results are stored in a dataframe with two columns: `n` (padded with leading zeros) and `i` (the number of iterations needed to reach Kaprekar’s constant). ``` r # Define the range of n values as a vector n_range <- 0:9999 # Apply the k_iter function to the range vector i <- vapply(n_range, k_iter, integer(1)) # Write the results to a dataframe df <- data.frame(n = sprintf("%04d", n_range), i = i) ``` ### Exploring the iteration counts Let’s show how many values of $n$ converge to Kaprekar’s constant for every possible iteration count. We see that only one value needs zero iteration (Kaprekar’s constant itself) since it already has the value `6174`. The mode (the most frequent count of iterations) is 3, and the maximum number of iterations needed is 7. ``` r df_summary <- as.data.frame(table(df[[2]], useNA = "ifany")) names(df_summary) <- c("Iterations", "Count of n") knitr::kable(df_summary, format = "pipe") ``` | Iterations | Count of n | |:-----------|-----------:| | 0 | 1 | | 1 | 383 | | 2 | 576 | | 3 | 2400 | | 4 | 1272 | | 5 | 1518 | | 6 | 1656 | | 7 | 2184 | | NA | 10 | Note also the nine `NA` values which correspond to numbers that do not converge. We can confirm that they are those with repeating digits as follows. ``` r cat(df$n[is.na(df$i)]) ``` 0000 1111 2222 3333 4444 5555 6666 7777 8888 9999 Let’s also plot this distribution of iterations required for convergence on a bar chart. ``` r ggplot(na.omit(df), aes(x = factor(i))) + geom_bar(fill = "#acbf87", color = NA) + geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5, size = 4, color = "#D1C6AD") + labs(x = "Count of iterations", y = "Count of n") + theme( axis.line = element_blank(), axis.text.x = element_text(color = "#D1C6AD", vjust = 5.5), axis.text.y = element_blank(), axis.ticks = element_blank(), axis.title = element_text(color = "#D1C6AD"), panel.grid = element_blank(), panel.background = element_rect(fill = "transparent", color = NA), plot.background = element_rect(fill = "transparent", color = NA), text = element_text(size = 14) ) ``` ![Distribution of iteration counts](includes/assets/notes/kaprekar's+constant+histogram-1.png) Next, we create a dataframe that contains the first two and last two digits of all 10,000 values of $n$ (0000—9999) as separate columns. ``` r y <- as.integer(n_range %/% 100) # First two digits of n x <- as.integer(n_range %% 100) # Last two digits of n # Create a dataframe with n, y, and x (all padded with zeros), and i df <- data.frame( n = sprintf("%04d", n_range), y = y, x = x, i = i ) # Pad the values with leading zeros for better readability df_show <- data.frame( n = sprintf("%04d", n_range), y = sprintf("%02d", y), x = sprintf("%02d", x), i = i ) # Create a spacer row with ellipses for better readability spacer <- as.data.frame(as.list(rep("…", ncol(df_show))), stringsAsFactors = FALSE) names(spacer) <- names(df_show) # Show the first and last few rows of the dataframe knitr::kable( rbind(head(df_show, 4), spacer, tail(df_show, 4)), format = "pipe", row.names = FALSE ) ``` | n | y | x | i | |:-----|:----|:----|:----| | 0000 | 00 | 00 | NA | | 0001 | 00 | 01 | 5 | | 0002 | 00 | 02 | 4 | | 0003 | 00 | 03 | 6 | | … | … | … | … | | 9996 | 99 | 96 | 6 | | 9997 | 99 | 97 | 4 | | 9998 | 99 | 98 | 5 | | 9999 | 99 | 99 | NA | We can now plot the count of iterations $i$ in a 100x100 grid format, where the x-axis represents the last two digits and the y-axis represents the first two digits of each $n$. ``` r # Pick a color palette with discrete steps for each iteration count, red for NA max_i <- max(df$i, na.rm = TRUE) levels_i <- 0:max_i df$i_f <- factor(df$i, levels = levels_i) ``` ``` r # Plot (y reversed so 0000 appears at top-left) ggplot(df, aes(x = x, y = y, fill = i_f)) + geom_raster() + scale_y_reverse(expand = c(0, 0)) + scale_x_discrete(expand = c(0, 0)) + coord_fixed() + scale_fill_viridis_d( option = "D", na.value = "red", # red for non-converging numbers drop = FALSE, name = "Iterations" ) + labs( title = "Plot of iterations to reach 6174", x = "Last two digits of n (00–99)", y = "First two digits of n (00–99)" ) + theme_minimal(base_size = 12) + theme( axis.line = element_blank(), axis.text.x = element_text(color = "#D1C6AD"), axis.text.y = element_blank(), axis.ticks = element_blank(), axis.title = element_text(color = "#D1C6AD"), panel.grid = element_blank(), panel.background = element_rect(fill = "transparent", color = NA), plot.background = element_rect(fill = "transparent", color = NA), plot.title = element_text(color = "#D1C6AD", face = "bold"), text = element_text(size = 14, color = "#D1C6AD") ) ``` ![Kaprekar iterations to reach 6174](includes/assets/notes/kaprekar's+constant+pattern-1.png)