+++
title = "Cheesing a subnetting test with Python"
date = 2025-11-02
description = "Breaking down the Python version of a script I wrote last year to blitz through subnetting tests for my computer networking uni course."

[taxonomies]
tags = ["Technical", "Python", "Networking"]

[extra]
katex = true
+++

## Background

Don't you hate it when you're supposed to perform a mundane task by hand, even though you can see an absurdly straightforward way of getting the computer to do pretty much all the dirty work for you?

That's exactly the kind of situation I found myself in last year, when our networking course tutor presented the manual approach to subnetting. It made my head hurt due to the sheer number of subnets and the subnet mask octet values we had to memorise.

I think it was reasonable for me to approach the test exactly like a software developer would: examining the problem, devising an algorithm to solve it, and of course translating said algorithm to code. In this post I'm looking to do that once again, only this time using Python to handle the last part.

[The original script](https://github.com/maciejpedzich/subnet-solver) was written in Rust, since I just so happened to be learning that language back then, and I thought it would be a neat little exercise to sharpen up my skills. You'll soon find out why porting it to Python turned out to be so beneficial and worthy of an article.

## Outlining the test formula

We're given a random IPv4 address along with the number of bits reserved for the network part of the base network's addresses - in other words, a random [CIDR (**C**lassless **I**nter-**D**omain **R**outing) notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation). Then we have a comma-separated list of subnets to divide the base network into. Each entry is represented by the subnet's unique name (a single uppercase letter), followed by a comma and the number of hosts that need to be connected to that network, with all these fields wrapped in parentheses.

Our first task is to determine the base network's address, subnet mask, broadcast address, and address pool size. Our second task is to divide the network such that the subnet with the largest pool size receives a chunk of the lowest addresses from the base pool, the second largest subnet gets a chunk that starts with the first address outside the previous pool, and so on until we've gone through all the entries.

If two subnets happen to have an identical address pool size, the tie is broken by falling back to alphabetical (or reverse alphabetical depending on the test version) order of their names.

## Explaining the algorithm by example

Suppose we have:

- a base CIDR of `2.137.69.4/20`
- a subnet list of `(P,510), (D,256), (H,2025), (W,873)`
- alphabetical tiebreak order

### Task 1

How do we go about doing the aforementioned tasks? The first one is insanely easy to cheese with Python, but I'm going to explain it step by step anyway, so that you get a better understanding of where the end result came from.

#### Address pool size

Let's tackle the address pool size first. Because the number on the right-hand side of the slash denotes the number of high-order (leftmost) bits that remain unchanged for each address in the pool, we can subtract it from 32 (the bit width of an IPv4 address) to obtain the number of bits that do change, or in other words, the bits that identify a host on the network.

We can then raise 2 to the power of that difference to find out how many unique addresses we can assign using the number of bits specified in the exponent, which is precisely the address pool size we're looking for.

In our example case, the answer is: $$2^{32-20}=2^{12}=4096$$

#### Subnet mask

As I mentioned in the previous sub-subsection, the number to the right of the slash tells us how many high-order bits remain unchanged across all the addresses. The subnet mask is a number that has exactly that many high-order bits set to 1, with all the remaining ones set to 0 in the binary representation.

In order to obtain an integer that will only have $n$ leftmost bits active, we can do some bit-shifting magic. The trick is to shift 1 by $n$ bits to the left and subtract 1 from the result to receive a number that has exactly $n$ least significant bits set to 1, and then shift that $32 - n$ bits to the left to move the active bits to the most significant positions.

The formula looks like this: $$((1\ll{n})-1)\ll(32-n)$$

For $n=20$, it evaluates to $4,294,963,200$ (or $-4096$ as a signed integer), which in binary representation (grouped into octets) is: $$11111111.11111111.11110000.00000000$$

Looks good to me! Converting the octets back to decimal will give us the answer: $$255.255.240.0$$
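If you'd like to double-check the shifting trick yourself, here's a minimal sketch of it. The `subnet_mask` and `to_dotted` helpers are names I made up for illustration, and since Python integers never overflow, you'll only ever see the unsigned value here.

```python
IPV4_BITS = 32  # bit width of an IPv4 address


def subnet_mask(prefix_len: int) -> int:
    """Integer with the prefix_len most significant of 32 bits set to 1."""
    return ((1 << prefix_len) - 1) << (IPV4_BITS - prefix_len)


def to_dotted(value: int) -> str:
    """Format a 32-bit integer as four decimal octets."""
    return ".".join(str(octet) for octet in value.to_bytes(4, "big"))


mask = subnet_mask(20)
print(mask)              # 4294963200
print(f"{mask:032b}")    # 20 ones followed by 12 zeros
print(to_dotted(mask))   # 255.255.240.0
```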

#### Network address

Now we can find out what the network address is by performing a bitwise AND operation between the address from the CIDR and the subnet mask we've just determined. It will keep the bits reserved for the network part of the address and clear all the host bits.

$$\qquad\enspace\space00000010.10001001.01000101.00000100\newline\text{AND}\enspace11111111.11111111.11110000.00000000\newline\overset{\rule[2.25pt]{198pt}{0.1pt}}{\newline\qquad\enspace\space00000010.10001001.01000000.00000000}$$

The answer in decimal is: $$2.137.64.0$$

#### Broadcast address

For the final piece of the task 1 puzzle, we have to perform a bitwise OR operation between the network address we've just learned and the inverted subnet mask. This will activate all the previously cleared host bits, giving us the highest possible address in the process.

$$\qquad\enspace\space00000010.10001001.01000000.00000000\newline\text{OR}\quad\space00000000.00000000.00001111.11111111\newline\overset{\rule[2.25pt]{198pt}{0.1pt}}{\newline\qquad\enspace\space00000010.10001001.01001111.11111111}$$

The answer in decimal is: $$2.137.79.255$$
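Both the AND and the OR step can be reproduced with plain integers. The following is a rough sketch rather than part of the final script, with `to_int` and `to_dotted` being hypothetical helpers. One thing to watch out for: Python's `~` yields a negative number, so I'm assuming we want to trim the inverted mask back down to 32 bits.

```python
def to_int(dotted: str) -> int:
    """Parse a dotted-decimal IPv4 address into a 32-bit integer."""
    return int.from_bytes(bytes(int(part) for part in dotted.split(".")), "big")


def to_dotted(value: int) -> str:
    """Format a 32-bit integer as four decimal octets."""
    return ".".join(str(octet) for octet in value.to_bytes(4, "big"))


address = to_int("2.137.69.4")
mask = to_int("255.255.240.0")

# AND keeps the network bits and clears the host bits
network = address & mask

# ~mask is negative in Python, so keep only the lowest 32 bits
inverted_mask = ~mask & 0xFFFFFFFF

# OR switches all the host bits back on
broadcast = network | inverted_mask

print(to_dotted(network))    # 2.137.64.0
print(to_dotted(broadcast))  # 2.137.79.255
```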

### Task 2

Alright, we've got everything we need to know about the base network, so we can move on to dividing it into smaller networks according to the provided list. Just to remind you, the list is `(P,510), (D,256), (H,2025), (W,873)` and the tiebreak ordering by names is alphabetical.

#### Calculating subnets' address pool sizes

We can't do any sorting straightaway, because we first have to find out what each subnet's address pool size is. You may recall from the [address pool size section](#address-pool-size) that we raised 2 to the power of number of bits reserved for identifying hosts in our base network. This exponent is the key to our solution, but how can we get hold of it?

Ideally, we'd want to somehow do the inverse of exponentiation, where we'd plug in the number of addresses we need (all the hosts plus network and broadcast addresses) to this mysterious function, which would return the lowest integer exponent that generates a power of 2 greater than or equal to the input. The output has to be an integer, because we can't use a fraction of a bit - it's an atomic unit.

Luckily for us, some clever people came up with two handy mathematical tools called [logarithm](https://en.wikipedia.org/wiki/Logarithm) and [ceiling](https://en.wikipedia.org/wiki/Floor_and_ceiling_functions) functions. The former is responsible for doing the inverse exponentiation, whereas the latter rounds up the real result to the closest integer or leaves it alone if it's already an integer.

Putting it all together - given $n$ hosts, the minimum required size of a subnet's address pool can be determined using this formula: $$\large{2^{\lceil{\log_2(n+2)}\rceil}}$$

Applying it to our example subnets gives us the following:

- **P**: $2^{\lceil{\log_2(512)}\rceil}=2^9=512$
- **D**: $2^{\lceil{\log_2(258)}\rceil}=2^9=512$
- **H**: $2^{\lceil{\log_2(2027)}\rceil}=2^{11}=2048$
- **W**: $2^{\lceil{\log_2(875)}\rceil}=2^{10}=1024$
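Here's a quick way to verify those numbers. Treat it as a sketch: `host_bits` is a name I picked for illustration, and the `bit_length` comparison is an integer-only alternative I've thrown in as a sanity check, not something the test requires.

```python
from math import ceil, log2


def host_bits(hosts: int) -> int:
    """Smallest exponent whose power of 2 fits all hosts plus the
    network and broadcast addresses."""
    return ceil(log2(hosts + 2))


for name, hosts in [("P", 510), ("D", 256), ("H", 2025), ("W", 873)]:
    bits = host_bits(hosts)
    # For positive integers, ceil(log2(n)) equals (n - 1).bit_length(),
    # which sidesteps floating-point arithmetic altogether
    assert bits == (hosts + 1).bit_length()
    print(name, bits, 2 ** bits)

# P 9 512
# D 9 512
# H 11 2048
# W 10 1024
```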

#### Sorting entries and subnetting base network

We can finally order the subnets, assign them appropriate address chunks, and learn their properties (i.e. network address, subnet mask, and broadcast address). It's clear that H and W will come out on top by their respective sizes alone, with P and D being tied on 512 addresses each. Since we're meant to break ties by using the alphabetical sorting of names, D will be placed before P.

From here we can follow very similar steps to those we took with the base network to derive the aforementioned properties. The H network's address will be the same as the base one. Its prefix length is 32 minus the exponent we used for the pool size ($32-11=21$), and plugging that into the formula I showed in [this section](#subnet-mask) gives us the subnet mask. Similarly, the [broadcast address](#broadcast-address) requires the same bitwise OR operation between the network address and the inverted subnet mask.

The next subnet's network address is essentially the previous one's broadcast address incremented by 1, and I'm pretty sure you get the idea from here, so let's skip ahead to the final result:

{% wide_container() %}
| Name | Pool size | Network address | Subnet mask   | Broadcast address |
| ---- | --------- | --------------- | ------------- | ----------------- |
| H    | 2048      | 2.137.64.0      | 255.255.248.0 | 2.137.71.255      |
| W    | 1024      | 2.137.72.0      | 255.255.252.0 | 2.137.75.255      |
| D    | 512       | 2.137.76.0      | 255.255.254.0 | 2.137.77.255      |
| P    | 512       | 2.137.78.0      | 255.255.254.0 | 2.137.79.255      |
{% end %}
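If you'd rather not take that table on faith, the whole walk can be sketched with nothing but integers. This isn't the script we're about to write, just the manual procedure spelled out, with `to_dotted` and the hard-coded `ordered` list being my own illustrative assumptions.

```python
def to_dotted(value: int) -> str:
    """Format a 32-bit integer as four decimal octets."""
    return ".".join(str(octet) for octet in value.to_bytes(4, "big"))


base_network = int.from_bytes(bytes([2, 137, 64, 0]), "big")
# (name, host bits), already sorted by size and then alphabetically
ordered = [("H", 11), ("W", 10), ("D", 9), ("P", 9)]

rows = []
network = base_network
for name, host_bits in ordered:
    size = 1 << host_bits
    prefix_len = 32 - host_bits
    mask = ((1 << prefix_len) - 1) << host_bits
    broadcast = network | (~mask & 0xFFFFFFFF)
    rows.append(
        (name, size, to_dotted(network), to_dotted(mask), to_dotted(broadcast))
    )
    network = broadcast + 1  # the next subnet starts right after this one

for row in rows:
    print(*row)
```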

## Implementing the algorithm

That's it for the theoretical part of this article. We can, at long last, write some code!

### Accepting and validating arguments

Let's say we want to pass the test parameters as command-line arguments in order:

1. base subnet's CIDR
2. comma-separated subnet list
3. tiebreak ordering (`"A->Z"` for alphabetical and `"Z->A"` for reverse alphabetical, because that's how they were denoted on my test).

We have to ensure that:

- we provide exactly 3 parameters
- the list is written as specified in the [test formula specification](#outlining-the-test-formula)
- the tiebreak ordering marker is one of the aforementioned strings

If any of the above checks fails, we should display an appropriate error message and exit the script with a failure status code (I'll keep using 1, but you can use different non-zero status codes if you're that keen on specifying the cause of failure).

Here's how we can code up this validation mechanism:

```python
import re
import sys

if len(sys.argv) != 4:
    print(
        "You have to provide exactly 3 arguments (in order):",
        "",
        "1. Base subnet's CIDR, eg. 123.45.67.89/10",
        '2. Comma-separated list of subnets with their name character and minimum number of hosts, eg. "(A,12), (B,34), (C,56)"',
        '3. "A->Z" to order subnets with the same pool sizes alphabetically, or "Z->A" to use reverse alphabetical order',
        sep="\n",
        file=sys.stderr
    )
    exit(1)
elif not re.fullmatch(r"^(?:\([A-Z],\d+\)(?:,\s*|$))+", sys.argv[2]):
    print("Invalid subnet list format!", file=sys.stderr)
    exit(1)
elif sys.argv[3] not in {"A->Z", "Z->A"}:
    print("Invalid order marker!", file=sys.stderr)
    exit(1)
```

Although checking the size of the list of arguments against 4 may seem like a typo at first glance, you have to keep in mind that the first string is the name of the script being executed, so _the actual parameters_ go from index 1 onwards.

As for the regex in the second if branch, it's not as complex or scary as it looks. Let's break it down:

- `^` (caret) matches the start of the string
- `(?:\([A-Z],\d+\)(?:,\s*|$))` marks a non-capturing group, where we match:
  - `\([A-Z],\d+\)` an opening bracket, an uppercase letter from A to Z, a comma, at least one digit, and a closing bracket
  - `(?:,\s*|$)` another non-capturing group with an alternative between:
    - `,\s*` a comma followed by zero or more whitespace characters
    - `$` the end of the string
- `+` matches at least one occurrence of the whole preceding group
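To get a feel for what the pattern accepts, you can throw a few sample strings at it. In this little sketch, `is_valid_list` is a hypothetical wrapper and the inputs are made up. Note that a list ending with a trailing comma slips through, because the `,\s*` alternative doesn't require another entry to follow.

```python
import re

SUBNET_LIST = re.compile(r"^(?:\([A-Z],\d+\)(?:,\s*|$))+")


def is_valid_list(text: str) -> bool:
    """Check a subnet list against the same pattern the script uses."""
    return SUBNET_LIST.fullmatch(text) is not None


print(is_valid_list("(A,12), (B,34), (C,56)"))  # True
print(is_valid_list("(A,12),(B,34)"))           # True
print(is_valid_list("(a,12)"))                  # False: lowercase name
print(is_valid_list("(A,12) (B,34)"))           # False: missing comma
print(is_valid_list("(A,)"))                    # False: no digits
print(is_valid_list(""))                        # False: no entries
print(is_valid_list("(A,12),"))                 # True: trailing comma
```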

> _But what about validating the CIDR notation?_

I hear you ask. Remember how I mentioned the cheesing potential of Python that makes this version of the script feel so overpowered? Check this out:

```python
from ipaddress import (
  AddressValueError,
  IPV4LENGTH,
  IPv4Network,
  NetmaskValueError
)

# ...

try:
    base_network = IPv4Network(sys.argv[1], strict=False)
except AddressValueError:
    print("Invalid base IPv4 address!", file=sys.stderr)
    exit(1)
except NetmaskValueError:
    print("Invalid base subnet mask bit count!", file=sys.stderr)
    exit(1)
```

This `try/except` block may not look unusual, but I'd like to draw your attention to the `strict=False` keyword argument. It tells the `IPv4Network` constructor to extract the network address from a CIDR that might contain the address of a host within the subnet, instead of raising a `ValueError`. And of course, we also get the subnet mask and broadcast address calculated for us.
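Here's a small experiment that shows the difference, using our example CIDR. It's only a sketch, and `describe_failure` is a throwaway helper I've invented to report which exception a given input triggers under the default strict mode.

```python
from ipaddress import AddressValueError, IPv4Network, NetmaskValueError

# strict=False masks off the host bits instead of rejecting the input
lenient = IPv4Network("2.137.69.4/20", strict=False)
print(lenient)                    # 2.137.64.0/20
print(lenient.netmask)            # 255.255.240.0
print(lenient.broadcast_address)  # 2.137.79.255
print(lenient.num_addresses)      # 4096


def describe_failure(cidr: str) -> str:
    """Return the name of the exception raised for a CIDR, or "ok"."""
    try:
        IPv4Network(cidr)  # strict=True is the default
    except (AddressValueError, NetmaskValueError) as error:
        return type(error).__name__
    except ValueError:
        return "ValueError"
    return "ok"


print(describe_failure("2.137.69.4/20"))    # ValueError (host bits set)
print(describe_failure("2.137.69.400/20"))  # AddressValueError
print(describe_failure("2.137.69.4/33"))    # NetmaskValueError
print(describe_failure("2.137.64.0/20"))    # ok
```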

So, was bringing up all the bitwise shenanigans in vain? I don't think so, because you should now be able to port the script to a language that doesn't offer the convenience of a similar class.

> _OK, but what if there aren't enough addresses in the base pool for all the subnets?_

Good question, we'll come back to it shortly.

### Extracting and sorting subnet entries

Next up, we have to extract and sort all the entries provided in the second parameter. Here's how I've tackled the implementation:

```python
from math import ceil, log2

# ...

subnet_entries = sorted(
    map(
        lambda match: (
            match.group(1),
            ceil(log2(int(match.group(2)) + 2))
        ),
        re.finditer(r"\(([A-Z]),(\d+)\)", sys.argv[2])
    ),
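    key=lambda entry: (
        -entry[1],
        ord(entry[0]) * (1 if sys.argv[3] == "A->Z" else -1)
    )
)
```

We kick things off by extracting all the entries into an iterator using a portion of our previous regular expression, except this time we capture both the subnet's name and the number of hosts to connect for easier access to each field's raw value.

We can get away with calculating the exponent without raising 2 to that power and sorting by the former alone, because powers of the same base are ordered exactly like their exponents. As for the sorting key, negating the exponent puts the largest subnets first, and the second element breaks ties by name: `ord` turns the letter into its character code, so that multiplying it by -1 flips the order for `"Z->A"`. Multiplying the string itself wouldn't work, since `"D" * -1` evaluates to an empty string and every tied name would compare equal.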
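To see the tiebreak in action for both directions, here's a standalone sketch that works on the exponents from our example. `sort_entries` is a hypothetical helper, and converting the one-letter name with `ord` so that it can be negated is my assumption about the simplest way to reverse the name order.

```python
def sort_entries(entries: list[tuple[str, int]], order: str) -> list[tuple[str, int]]:
    """Sort (name, host bits) pairs by descending size, then by name."""
    direction = 1 if order == "A->Z" else -1
    return sorted(
        entries,
        key=lambda entry: (-entry[1], direction * ord(entry[0]))
    )


entries = [("P", 9), ("D", 9), ("H", 11), ("W", 10)]

print(sort_entries(entries, "A->Z"))
# [('H', 11), ('W', 10), ('D', 9), ('P', 9)]
print(sort_entries(entries, "Z->A"))
# [('H', 11), ('W', 10), ('P', 9), ('D', 9)]
```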

### Creating actual subnets for each entry

Before we get down to the subnetting business, we'll have to come back to the deferred question about validating the base pool size against the total subnets' pool sizes. For the former we can leverage the [formula I introduced earlier](#address-pool-size), whereas for the latter we can sum 2s raised to the recently calculated powers.

```python
base_addr_pool_size = 2 ** (IPV4LENGTH - base_network.prefixlen)
total_subnets_pool_size = sum(
    map(
        lambda entry: 2 ** entry[1],
        subnet_entries
    )
)

if total_subnets_pool_size > base_addr_pool_size:
    print(
        "The total size of provided subnets exceeds that of the base address pool!",
        file=sys.stderr
    )
    exit(1)
```

Alright, it's subnetting time:

```python
table_rows: list[tuple[str, int, str, str, str]] = []

for name, suffixlen in subnet_entries:
    prefixlen = IPV4LENGTH - suffixlen
    subnet = IPv4Network(
        (base_network.network_address, prefixlen)
    )

    table_rows.append(
        (
            name,
            2 ** suffixlen,
            str(subnet.network_address),
            str(subnet.netmask),
            str(subnet.broadcast_address)
        )
    )
    base_network = IPv4Network(
        int(subnet.broadcast_address) + 1
    )
```

Once again, no trace of bitwise black magic thanks to the power of the `IPv4Network` class. It's also worth pointing out that it's perfectly acceptable to pass an address to its constructor without specifying the subnet mask's bit count, like in that `base_network` reassignment. In that case, the constructor implicitly sets it to 32.
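Here's what a single iteration of that loop boils down to for the H subnet. It's a minimal sketch with the starting address and prefix length hard-coded from our example.

```python
from ipaddress import IPv4Address, IPv4Network

# A (address, prefix length) tuple is one of the accepted constructor inputs
subnet = IPv4Network((IPv4Address("2.137.64.0"), 21))
print(subnet)                    # 2.137.64.0/21
print(subnet.broadcast_address)  # 2.137.71.255

# A bare integer is treated as a single address, hence the /32
next_start = IPv4Network(int(subnet.broadcast_address) + 1)
print(next_start)                  # 2.137.72.0/32
print(next_start.network_address)  # 2.137.72.0
```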

### Pretty-printing results

And last but not least, we need to output the results of our little script. Let's start by printing the key details about our base network between the pool size check and the `table_rows` declaration:

```python
print(
    f"Base network address: {base_network.network_address}",
    f"Base subnet mask: {base_network.netmask}",
    f"Base broadcast address: {base_network.broadcast_address}",
    "",
    f"Base address pool size: {base_addr_pool_size:,}",
    f"Total subnets' pool size: {total_subnets_pool_size:,}",
    "",
    sep="\n"
)
```

To top it all off, let's print a neat ASCII table after looping through `subnet_entries`:

```python
row_separator = f"+{'-'*6}+{'-'*15}+{'-'*17}+{'-'*17}+{'-'*19}+"

print(
    row_separator,
    "| Name |   Pool size   | Network address |   Subnet mask   | Broadcast address |",
    "=" * len(row_separator),
    sep="\n"
)

for row in table_rows:
    print(
        f"|{row[0]:<6}|{row[1]:<15,}|{row[2]:<17}|{row[3]:<17}|{row[4]:<19}|",
        row_separator,
        sep="\n"
    )
```

Moment of truth:

```text
> python subnets.py 2.137.69.4/20 "(P,510), (D,256), (H,2025), (W,873)" "A->Z"
Base network address: 2.137.64.0
Base subnet mask: 255.255.240.0
Base broadcast address: 2.137.79.255

Base address pool size: 4,096
Total subnets' pool size: 4,096

+------+---------------+-----------------+-----------------+-------------------+
| Name |   Pool size   | Network address |   Subnet mask   | Broadcast address |
================================================================================
|H     |2,048          |2.137.64.0       |255.255.248.0    |2.137.71.255       |
+------+---------------+-----------------+-----------------+-------------------+
|W     |1,024          |2.137.72.0       |255.255.252.0    |2.137.75.255       |
+------+---------------+-----------------+-----------------+-------------------+
|D     |512            |2.137.76.0       |255.255.254.0    |2.137.77.255       |
+------+---------------+-----------------+-----------------+-------------------+
|P     |512            |2.137.78.0       |255.255.254.0    |2.137.79.255       |
+------+---------------+-----------------+-----------------+-------------------+
```

Yay, we've done it!

## Complete source code

```python,linenos,name=subnets.py
# (c) 2025 Maciej Pędzich
# Released under the CC BY-SA 4.0 license:
# https://creativecommons.org/licenses/by-sa/4.0

from ipaddress import (
  AddressValueError,
  IPV4LENGTH,
  IPv4Network,
  NetmaskValueError
)
from math import ceil, log2
import re
import sys

if len(sys.argv) != 4:
    print(
        "You have to provide exactly 3 arguments (in order):",
        "",
        "1. Base subnet's CIDR, eg. 123.45.67.89/10",
        '2. Comma-separated list of subnets with their name character and minimum number of hosts, eg. "(A,12), (B,34), (C,56)"',
        '3. "A->Z" to order subnets with the same pool sizes alphabetically, or "Z->A" to use reverse alphabetical order',
        sep="\n",
        file=sys.stderr
    )
    exit(1)
elif not re.fullmatch(r"^(?:\([A-Z],\d+\)(?:,\s*|$))+", sys.argv[2]):
    print('Invalid subnet list format!', file=sys.stderr)
    exit(1)
elif sys.argv[3] not in {"A->Z", "Z->A"}:
    print('Invalid order marker!', file=sys.stderr)
    exit(1)

try:
    base_network = IPv4Network(sys.argv[1], strict=False)
except AddressValueError:
    print("Invalid base IPv4 address!", file=sys.stderr)
    exit(1)
except NetmaskValueError:
    print("Invalid base subnet mask bit count!", file=sys.stderr)
    exit(1)

subnet_entries = sorted(
    map(
        lambda match: (
            match.group(1),
            ceil(log2(int(match.group(2)) + 2))
        ),
        re.finditer(r"\(([A-Z]),(\d+)\)", sys.argv[2])
    ),
    key=lambda entry: (
        -entry[1],
        ord(entry[0]) * (1 if sys.argv[3] == "A->Z" else -1)
    )
)
base_addr_pool_size = 2 ** (IPV4LENGTH - base_network.prefixlen)
total_subnets_pool_size = sum(
    map(
        lambda entry: 2 ** entry[1],
        subnet_entries
    )
)

if total_subnets_pool_size > base_addr_pool_size:
    print(
        "The total size of provided subnets exceeds that of the base address pool!",
        file=sys.stderr
    )
    exit(1)

print(
    f"Base network address: {base_network.network_address}",
    f"Base subnet mask: {base_network.netmask}",
    f"Base broadcast address: {base_network.broadcast_address}",
    "",
    f"Base address pool size: {base_addr_pool_size:,}",
    f"Total subnets' pool size: {total_subnets_pool_size:,}",
    "",
    sep="\n"
)

table_rows: list[tuple[str, int, str, str, str]] = []

for name, suffixlen in subnet_entries:
    prefixlen = IPV4LENGTH - suffixlen
    subnet = IPv4Network(
        (base_network.network_address, prefixlen)
    )

    table_rows.append(
        (
            name,
            2 ** suffixlen,
            str(subnet.network_address),
            str(subnet.netmask),
            str(subnet.broadcast_address)
        )
    )
    base_network = IPv4Network(
        int(subnet.broadcast_address) + 1
    )

row_separator = f"+{'-'*6}+{'-'*15}+{'-'*17}+{'-'*17}+{'-'*19}+"

print(
    row_separator,
    "| Name |   Pool size   | Network address |   Subnet mask   | Broadcast address |",
    "=" * len(row_separator),
    sep="\n"
)

for row in table_rows:
    print(
        f"|{row[0]:<6}|{row[1]:<15,}|{row[2]:<17}|{row[3]:<17}|{row[4]:<19}|",
        row_separator,
        sep="\n"
    )
```