>[!abstract]
>Kaprekar's constant is the number 6174, named after the Indian mathematician D. R. Kaprekar who identified the following interesting property about it in 1955:
>
>1. Pick any natural number $N$ less than 10,000, padding it with leading zeros as needed to make it a four-digit number. It must have at least two different digits.
>2. Sort the digits in descending and then in ascending order to form two four-digit numbers, padding them with leading zeros as needed.
>3. Subtract the smaller number from the bigger number.
>4. Go back to step 2 and repeat.
>
>The above procedure, known as Kaprekar's routine, will always reach the number 6174 in no more than 7 iterations. Iterating further will continue yielding 6174, which is a *fixed point* of the routine.
>
>There is nothing special about the number 6174, however. There are infinitely many procedures that converge to a fixed point, oscillate between two values, or diverge with no repeating pattern. Kaprekar's constant is just well known because of its simplicity and guaranteed convergence to a relatively small number.
>[!example] Example with $N = 1549$
>$9541 - 1459 = 8082$
>$8820 - 0288 = 8532$
>$8532 - 2358 = 6174$
>$7641 - 1467 = 6174$
>$\ldots$
## Why does it converge to 6174?
Let $N \in \mathbb{Z}$ such that $0001 \leq N \leq 9999$ and $N \notin \{1111, \ldots, 9999\}$.
(WIP)
### Translating Kaprekar’s routine into R
Let’s first define a function that applies Kaprekar’s routine once to an input value $n$. I could have used the `sprintf` function to pad $n$ with leading zeros where necessary, but that would have required a type conversion to `character` and then back to `integer` again. It feels cleaner to use integer division and modulo operations to split the digits of $n$ directly into a vector of four `integer` elements.
``` r
k_step <- function(n) {
# Check that n is a positive integer smaller than 10^4
stopifnot(n >= 1, n <= 9999, n == floor(n))
# Split the digits of n and pad them with leading zeros if necessary
digits <- n %/% 10^(3:0) %% 10
# Sort digits in descending order to form the largest possible number
dig_hi <- sort(digits, decreasing = TRUE)
# Sort digits in ascending order to form the smallest possible number
dig_lo <- sort(digits, decreasing = FALSE)
# Return the difference between those two numbers
sum(dig_hi * 10^(3:0) - dig_lo * 10^(3:0))
}
```
We also need a function to check if all digits of $n$ are identical, as such numbers do not converge to Kaprekar’s constant and will need special treatment.
``` r
k_check <- function(n) {
# Split the digits of n and pad them with leading zeros if necessary
digits <- n %/% 10^(3:0) %% 10
# Return TRUE if all digits are the same
all(digits == digits[1])
}
```
We can now define a function that counts iterations needed to reach Kaprekar’s constant from a given input number $n$. Note that I use R’s `L` shorthand notation to ensure that the iteration count `i` is stored as an integer and not a double.
``` r
k_iter <- function(n, target = k_c) {
# Check if n has all identical digits, return NA if so
if (k_check(n)) return(NA_integer_)
# Initialize iteration count
i <- 0L
# Apply Kaprekar's routine until we reach the constant
while(n != 6174) {
n <- k_step(n)
i <- i + 1L
}
# Return the number of iterations
i
}
```
### Applying the routine to all values of n \< 10,000
With our functions defined, we can now compute the count of iterations for all values of $n$ less than 10^4. The results are stored in a dataframe with two columns: `n` (padded with leading zeros) and `i` (the number of iterations needed to reach Kaprekar’s constant).
``` r
# Define the range of n values as a vector
n_range <- 0:9999
# Apply the k_iter function to the range vector
i <- vapply(n_range, k_iter, integer(1))
# Write the results to a dataframe
df <- data.frame(n = sprintf("%04d", n_range), i = i)
```
### Exploring the iteration counts
Let’s show how many values of $n$ converge to Kaprekar’s constant for every possible iteration count. We see that only one value needs zero iteration (Kaprekar’s constant itself) since it already has the value `6174`. The mode (the most frequent count of iterations) is 3, and the maximum number of iterations needed is 7.
``` r
df_summary <- as.data.frame(table(df[[2]], useNA = "ifany"))
names(df_summary) <- c("Iterations", "Count of n")
knitr::kable(df_summary, format = "pipe")
```
| Iterations | Count of n |
|:-----------|-----------:|
| 0 | 1 |
| 1 | 383 |
| 2 | 576 |
| 3 | 2400 |
| 4 | 1272 |
| 5 | 1518 |
| 6 | 1656 |
| 7 | 2184 |
| NA | 10 |
Note also the nine `NA` values which correspond to numbers that do not converge. We can confirm that they are those with repeating digits as follows.
``` r
cat(df$n[is.na(df$i)])
```
0000 1111 2222 3333 4444 5555 6666 7777 8888 9999
Let’s also plot this distribution of iterations required for convergence on a bar chart.
``` r
ggplot(na.omit(df), aes(x = factor(i))) +
geom_bar(fill = "#acbf87", color = NA) +
geom_text(stat = "count", aes(label = after_stat(count)),
vjust = -0.5, size = 4, color = "#D1C6AD") +
labs(x = "Count of iterations", y = "Count of n") +
theme(
axis.line = element_blank(),
axis.text.x = element_text(color = "#D1C6AD", vjust = 5.5),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
axis.title = element_text(color = "#D1C6AD"),
panel.grid = element_blank(),
panel.background = element_rect(fill = "transparent", color = NA),
plot.background = element_rect(fill = "transparent", color = NA),
text = element_text(size = 14)
)
```

Next, we create a dataframe that contains the first two and last two digits of all 10,000 values of $n$ (0000—9999) as separate columns.
``` r
y <- as.integer(n_range %/% 100) # First two digits of n
x <- as.integer(n_range %% 100) # Last two digits of n
# Create a dataframe with n, y, and x (all padded with zeros), and i
df <- data.frame(
n = sprintf("%04d", n_range),
y = y,
x = x,
i = i
)
# Pad the values with leading zeros for better readability
df_show <- data.frame(
n = sprintf("%04d", n_range),
y = sprintf("%02d", y),
x = sprintf("%02d", x),
i = i
)
# Create a spacer row with ellipses for better readability
spacer <- as.data.frame(as.list(rep("…", ncol(df_show))), stringsAsFactors = FALSE)
names(spacer) <- names(df_show)
# Show the first and last few rows of the dataframe
knitr::kable(
rbind(head(df_show, 4), spacer, tail(df_show, 4)),
format = "pipe",
row.names = FALSE
)
```
| n | y | x | i |
|:-----|:----|:----|:----|
| 0000 | 00 | 00 | NA |
| 0001 | 00 | 01 | 5 |
| 0002 | 00 | 02 | 4 |
| 0003 | 00 | 03 | 6 |
| … | … | … | … |
| 9996 | 99 | 96 | 6 |
| 9997 | 99 | 97 | 4 |
| 9998 | 99 | 98 | 5 |
| 9999 | 99 | 99 | NA |
We can now plot the count of iterations $i$ in a 100x100 grid format, where the x-axis represents the last two digits and the y-axis represents the first two digits of each $n$.
``` r
# Pick a color palette with discrete steps for each iteration count, red for NA
max_i <- max(df$i, na.rm = TRUE)
levels_i <- 0:max_i
df$i_f <- factor(df$i, levels = levels_i)
```
``` r
# Plot (y reversed so 0000 appears at top-left)
ggplot(df, aes(x = x, y = y, fill = i_f)) +
geom_raster() +
scale_y_reverse(expand = c(0, 0)) +
scale_x_discrete(expand = c(0, 0)) +
coord_fixed() +
scale_fill_viridis_d(
option = "D",
na.value = "red", # red for non-converging numbers
drop = FALSE,
name = "Iterations"
) +
labs(
title = "Plot of iterations to reach 6174",
x = "Last two digits of n (00–99)",
y = "First two digits of n (00–99)"
) +
theme_minimal(base_size = 12) +
theme(
axis.line = element_blank(),
axis.text.x = element_text(color = "#D1C6AD"),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
axis.title = element_text(color = "#D1C6AD"),
panel.grid = element_blank(),
panel.background = element_rect(fill = "transparent", color = NA),
plot.background = element_rect(fill = "transparent", color = NA),
plot.title = element_text(color = "#D1C6AD", face = "bold"),
text = element_text(size = 14, color = "#D1C6AD")
)
```
