NexusProbe v0.3.0: From Port Checker to Service Discovery Engine


Bro, v0.2.0 sorted out the biggest problem we had: speed. ThreadPoolExecutor made the scanner fast enough to actually use, and CLI args via argparse meant we weren't hardcoding the target IP every single time. But once that was working, the limitation became obvious fast: we knew a port was open, but we had no idea what was running on it.

An open port without context is useless for actual recon. Port 8080 open — okay, but is that a Jenkins instance, a proxy, a dev server? Port 22 open — what SSH version, what software? That's the gap v0.3.0 closes.

So this version isn't about going faster (we're already fast). It's about going smarter — adding active service detection via banner grabbing, giving the user control over thread count and timeout, adding a live progress bar so the scan doesn't feel like a black box, and letting results actually be saved to disk instead of just flying past in the terminal.

WHAT'S NEW IN v0.3.0

  • Service Detection: banner grabbing for HTTP, SSH, FTP, DNS and a generic fallback for everything else

  • Configurable concurrency: -w/--workers to control thread pool size instead of relying on the default

  • Configurable timeout: -T/--timeout for the socket

  • Export support: -f/--format (json/csv) + -o/--output to dump results to a file

  • Live progress bar: tqdm wired into the thread pool so you can actually see the scan progressing

  • Structured results: scan_port now returns a dict ({"port": ..., "status": ..., "service": ...}) instead of just printing straight to stdout

ARCHITECTURE CHANGES

The folder structure grew by two files:

├── main.py
├── nexusprobe
│   ├── cli.py        # CLI argument parser (now with -f, -o, -w, -T)
│   ├── engine.py      # Threadpool logic + tqdm + export wiring
│   ├── enums.py       # NEW: output format enum (json/csv)
│   ├── export.py      # NEW: csv/json writers
│   ├── __init__.py    # Metadata & exports
│   └── scanner.py     # Scanner logic + service detection
└── README.md

enums.py and export.py are new. Splitting export logic into its own file instead of stuffing it into engine.py keeps the "scan" responsibility and the "write results to disk" responsibility separate — if export breaks tomorrow (say I want to add XML), I know exactly where to go without touching scan logic at all.

DIVING TO CODE

nexusprobe/cli.py

import argparse
from .enums import outputformat
from . import __version__


def receive_args():
    parser = argparse.ArgumentParser(
        prog="nexusprobepy", description="Nexus ProbePy CLI", epilog="nexusprobepy"
    )

    parser.add_argument("-t", "--target", help="Target IP address or hostname")
    parser.add_argument("-p", "--port", help="Target port number")
    parser.add_argument("-a", "--all", action="store_true", help="Scan all ports")
    parser.add_argument("-r", "--range", help="Port range to scan")

    parser.add_argument("-f", "--format", type=str, choices=[f.value for f in outputformat], help="Output in json or csv")
    parser.add_argument("-o", "--output", help="Output file path")

    parser.add_argument('-w', '--workers', type=int, default=100, help="No of Threads to use")
    parser.add_argument('-T', '--timeout', type=float, help="set time out for the scanner")

    parser.add_argument("-v", "--version", action="version", version=f"%(prog)s {__version__}")
    args = parser.parse_args()

    if not args.target:
        parser.error("Target is required")

    if args.format and not args.output:
        parser.error("Output file path is required when using format")

    return args

Four new flags here, and each one solves a real problem we hit while using v0.2.0:

  1. -f/--format and -o/--output: In v0.2.0, every scan result was gone the second it scrolled off your terminal. If you wanted to keep a record for a report, you were stuck screenshotting or copy-pasting. Now you can pass -f json -o results.json and get a structured file you can actually feed into another tool later. Notice choices=[f.value for f in outputformat]: this pulls valid values directly from the outputformat enum instead of hardcoding ["json", "csv"] as a string list. If I add a new format later, I add it once in enums.py and the CLI validation picks it up automatically.

  2. Validation logic (if args.format and not args.output): this is a small thing but it matters. Exporting to a format without telling the program where to put it makes no sense, so we fail fast with a clear error instead of silently doing nothing or crashing deep inside export.py. (The reverse case, -o without -f, isn't rejected; the engine just skips the export.)

  3. -w/--workers: v0.2.0 just called ThreadPoolExecutor() with no max_workers, which means Python picks a default based on CPU count (on recent versions, min(32, CPU count + 4)). That's fine for casual use, but if you're scanning all 65535 ports on a fast network, you want control. default=100 gives a sane out-of-the-box value while letting power users crank it up or down.

  4. -T/--timeout: previously the socket timeout (1.5 seconds) was hardcoded inside scanner.py. Hardcoded timeouts are a problem because every network is different: scanning over a slow VPN needs a longer timeout, scanning on local LAN can be much shorter for speed. Exposing it as a flag means the user decides instead of me guessing. One caveat for this version: the flag has no default (so it's None when omitted), and scan_port still calls sock.settimeout(1.5) while get_service_detection falls back to its own timeout=1.5, so the value is parsed but not yet applied to the sockets.
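Here's a minimal sketch of how these flags could be tightened up. It's not the v0.3.0 code: OutputFormat is a stand-in for the outputformat enum, positive_int is a hypothetical helper, and defaulting -T to 1.5 is my assumption based on the old hardcoded value.

```python
import argparse
from enum import Enum


class OutputFormat(Enum):  # hypothetical stand-in for nexusprobe.enums.outputformat
    JSON = "json"
    CSV = "csv"


def positive_int(text):
    """argparse type hook (hypothetical helper): reject 0 and negative worker counts."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("must be at least 1")
    return value


def build_parser():
    parser = argparse.ArgumentParser(prog="nexusprobepy")
    parser.add_argument("-t", "--target", required=True)
    parser.add_argument("-f", "--format", choices=[f.value for f in OutputFormat])
    parser.add_argument("-o", "--output")
    parser.add_argument("-w", "--workers", type=positive_int, default=100)
    # Assumption: fall back to the 1.5 s value that used to be hardcoded.
    parser.add_argument("-T", "--timeout", type=float, default=1.5)
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.format and not args.output:
        parser.error("Output file path is required when using format")
    return args


args = parse(["-t", "192.168.1.40", "-f", "json", "-o", "results.json", "-w", "200"])
print(args.format, args.workers, args.timeout)  # json 200 1.5
```

Passing argv explicitly (instead of letting parse_args read sys.argv) is what makes the parser easy to exercise from a test or a REPL.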

nexusprobe/enums.py

from enum import Enum

class outputformat(Enum):
    JSON = "json"
    CSV = "csv"

    def __str__(self):
        return self.value

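Small file, but it's doing an important job: the CLI gets one source of truth for valid output formats instead of a hardcoded string list. __str__ returning self.value means that when a member gets printed directly (print(outputformat.JSON)) it shows json instead of the ugly outputformat.JSON. Two honest notes: argparse's choices help already shows json/csv because we pass it the f.value strings, not the members, and engine.py still compares raw strings (args.format.lower() == "csv") for now, so the enum isn't the only place format names live yet.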
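To make the enum behaviour concrete, here's a small sketch (the class is renamed OutputFormat for the example; it mirrors the enum above but isn't imported from the project). It shows value lookup from a raw CLI string, and the difference between str() and repr().

```python
from enum import Enum


class OutputFormat(Enum):  # mirrors the outputformat enum above, renamed for the sketch
    JSON = "json"
    CSV = "csv"

    def __str__(self):
        return self.value


fmt = OutputFormat("csv")        # value lookup: raw CLI string -> enum member
print(str(fmt))                  # csv
print(repr(fmt))                 # <OutputFormat.CSV: 'csv'>
print(fmt is OutputFormat.CSV)   # True

try:
    OutputFormat("xml")          # unknown values raise ValueError
except ValueError as exc:
    print("rejected:", exc)
```

Converting args.format to a member once, right after parsing, would let the rest of the code compare against OutputFormat.CSV instead of string literals.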

nexusprobe/export.py

import csv
import json

def export_to_csv(filename, data):
    """
    export scan output to csv file
    """
    print(f"Exporting to CSV: {filename} ({len(data)} rows) \n")
    with open(filename, 'w') as f:
        writer = csv.DictWriter(f, fieldnames=data[0].keys())
        writer.writeheader()
        writer.writerows(data)
    print(f"Exported to CSV: {filename}")

def export_to_json(filename, data):
    """
    export scan output to json file
    """
    with open(filename, 'w') as f:
        json.dump(data, f, indent=4)
    print(f"Exported to JSON: {filename}")

Straightforward — export_to_csv uses csv.DictWriter, which means it derives the column headers directly from the keys of the first result dict (port, status, service). That's only possible because scan_port now returns structured dicts instead of just printing — the export logic depends entirely on that earlier change in scanner.py. export_to_json is even simpler since json.dump handles a list of dicts natively.

One thing worth flagging for myself before v0.4.0: data[0].keys() in export_to_csv will throw an IndexError if data is empty — e.g. if every port scanned was closed. Need a guard there.
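One possible guard, as a sketch rather than the final fix: pin the column names up front so nothing depends on data[0]. FIELDNAMES is my assumption based on the dict scan_port returns; an empty scan then produces a header-only file instead of an IndexError.

```python
import csv

FIELDNAMES = ["port", "status", "service"]  # assumption: the keys scan_port returns


def export_to_csv(filename, data):
    """Write scan results to CSV; an empty result list yields a header-only file."""
    # newline="" is what the csv module docs recommend for files handed to a csv writer
    with open(filename, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(data)
    return len(data)  # number of rows written, handy for a status message
```

The trade-off is that the column list now has to be kept in sync with scan_port by hand; the alternative is an early return when data is empty.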

nexusprobe/scanner.py

This is where the real upgrade happened. v0.2.0's scan_port just told you open/filtered/closed. v0.3.0 adds an entire get_service_detection function for banner grabbing:

def get_service_detection(sock, port, timeout=1.5):
    """
    Active Service Detection
    """
    try:
        sock.settimeout(timeout)

        if port in [80, 8080, 443]:
            try:
                sock.send(b"GET / HTTP/1.1\r\nHost:test \r\n\r\n")
                response = sock.recv(1024).decode(errors='ignore').strip()

                if "Server:" in response:
                    for line in response.split("\r\n"):
                        if line.startswith("Server:"):
                            return line.split("Server:")[1].strip()
                    return "HTTP Service"
            except:
                return "HTTP SERVICE (NO RESPONSE)"

        if port in [22, 2222]:
            banner = sock.recv(1024).decode(errors='ignore').strip()
            return banner if banner else "SSH Service"

        if port in [21, 2121]:
            banner = sock.recv(1024).decode(errors='ignore').strip()
            if "ready" in banner.lower() or 'welcome' in banner.lower():
                sock.send(b"FEAT\r\n")
                syst_response = sock.recv(1024).decode(errors="ignore").strip()
                security = " (Supports TLS/SSL)" if "AUTH TLS" in syst_response else ""
                if "ja-JP" in syst_response and "zh-CN" in syst_response:
                    return f"FTP Service [FileZilla Server]{security}"
            return "FTP Service"

        if port in [53, 5353]:
            sock.send(b"\x00\x06\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x07version\x04bind\x00\x00\x10\x00\x03")
            res = sock.recv(1024).decode(errors='ignore').strip()
            return "DNS Service"

        else:
            banner = sock.recv(1024).decode(errors='ignore').strip()
            if banner:
                return f"FTP {banner}" if banner.startswith("220") else banner
    except:
        return "Unknown Service"

The logic here is protocol-aware, not just "throw bytes and see what comes back." Each branch speaks the actual protocol the port is expected to run:

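  • HTTP (80/8080/443): sends a real GET / HTTP/1.1 request and parses the Server: header out of the response. It's the same header curl -I prints (curl -I sends a HEAD request rather than GET), and it's enough to fingerprint web servers (nginx, Apache, etc.) without needing a full HTTP client library. Caveat: 443 normally expects a TLS handshake first, so a plaintext GET there may be rejected or answered with an error page.

  • SSH (22/2222): SSH servers send their banner (SSH-2.0-OpenSSH_x.x) immediately on connect, no request needed, so we just recv and read it straight off the wire.

  • FTP (21/2121): checks for a "ready"/"welcome" banner, then sends the FEAT command (FTP's "what features do you support" command) to detect things like TLS support, and even fingerprints FileZilla Server specifically by checking for ja-JP/zh-CN locale strings in the FEAT response. A small but legit fingerprinting trick.

  • DNS (53/5353): sends a raw DNS query for version.bind under the CHAOS class, which is the standard way to ask a BIND-based DNS server to reveal its version string. In v0.3.0 the reply is read but not parsed yet, so this branch always reports "DNS Service". Also, as far as I can tell, the probe bytes are a bare DNS message; DNS over TCP expects a 2-byte length prefix in front of it (RFC 1035), so that's on the fix list too.

  • Fallback (else): for anything else, just grab whatever banner the service sends unprompted, and detect FTP-style 220 response codes as a bonus (SMTP servers also greet with 220, so that label can be wrong).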

This is the same basic idea behind nmap -sV: protocol-specific probes instead of one generic "send and pray" approach.
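Two of these probes are easy to pull out into pure functions that can be tested without a network. This is a sketch, not the project's code: parse_server_header and build_version_bind_query are hypothetical names. The first extracts the Server header case-insensitively; the second builds the same version.bind question as the byte string above and adds the 2-byte length prefix that DNS over TCP uses.

```python
import struct


def parse_server_header(raw_response):
    """Return the Server header value from raw HTTP response text, or None."""
    head = raw_response.split("\r\n\r\n", 1)[0]       # ignore the body
    for line in head.split("\r\n")[1:]:                # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":   # header names are case-insensitive
            return value.strip()
    return None


def build_version_bind_query(txid=6):
    """Build a CHAOS-class TXT query for version.bind, framed for DNS over TCP."""
    # Header: id, flags (RD set), 1 question, 0 answer/authority/additional records
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(label)]) + label for label in (b"version", b"bind")) + b"\x00"
    question = qname + struct.pack("!HH", 16, 3)       # QTYPE=TXT (16), QCLASS=CH (3)
    message = header + question
    return struct.pack("!H", len(message)) + message   # 2-byte length prefix for TCP


print(parse_server_header("HTTP/1.1 200 OK\r\nserver: nginx/1.24.0\r\n\r\n"))  # nginx/1.24.0
print(len(build_version_bind_query()))                                          # 32
```

Keeping the parsing separate from the socket calls also means a weird response can be replayed from a saved string when debugging.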

scan_port itself changed too — from printing tables directly to returning data:

def scan_port(task):
    host, port = task
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(1.5)
        result = sock.connect_ex((host, int(port)))

        if result == 0:
            service = get_service_detection(sock, port)
            return {"port": port, "status": "open", "service": service}
            sock.close()
        elif result in (errno.EAGAIN, errno.EALREADY, 115):
            sock.close()
        elif result == errno.ECONNREFUSED:
            sock.close()
    except KeyboardInterrupt:
        print("KeyboardInterrupt from user")
        exit(1)
    except socket.gaierror as e:
        print(f"Error: {e}")
        exit(1)
    except socket.timeout:
        exit(1)

This is the most important architectural shift in this version: v0.2.0 printed inside the worker thread, v0.3.0 returns a value from it. Printing directly from worker threads is messy: with 100 threads firing concurrently you can get garbled/interleaved output. Returning structured data instead means the engine collects everything in one list (in completion order, not port order) and decides what to do with it (print, export, both). That's separation of concerns between "what did the scan find" and "what do we do with what we found." Note that only the open branch returns a dict; closed and filtered ports fall through and return None, which the engine skips.

Small bug worth noting honestly: return {...} followed by sock.close() on the next line means that sock.close() is dead code — it never executes since return exits the function immediately. Socket cleanup here is currently relying on garbage collection rather than an explicit close. Something to fix in the next pass — probably by closing the socket before returning.
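One way the cleanup could look, sketched under a few assumptions: the timeout parameter is added for the example, and the service lookup is replaced by a placeholder so the snippet runs on its own. Socket objects work as context managers, so the with-block closes the socket on every exit path, early returns included.

```python
import socket


def scan_port(task, timeout=1.5):
    """Return a result dict for an open port, or None for anything else."""
    host, port = task
    # The with-block closes the socket on every exit path, including early returns.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        if sock.connect_ex((host, int(port))) != 0:
            return None
        # Placeholder: the real code would call get_service_detection(sock, port) here.
        return {"port": int(port), "status": "open", "service": "unknown"}
```

Closing before the return statement would work too; the context manager just makes it impossible to forget on a new branch.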

nexusprobe/engine.py

import concurrent.futures
from socket import timeout
import time
from .export import export_to_csv, export_to_json
from .cli import receive_args
from .scanner import is_online, scan_port
from tqdm import tqdm


def run_engine():
    args = receive_args()
    outputs = []

    try:
        if not is_online(args.target):
            print(f"[ - ] Host {args.target} is not online")
            exit(0)

        if args.port:
            portargs = args.port.split(",")
            tasks = [(args.target, port) for port in portargs]

        elif args.range:
            start_port = int(args.range.split("-")[0])
            end_port = int(args.range.split("-")[1])
            if start_port > end_port:
                print(f"[ - ] Invalid range: {args.range}")
                return
            ports_range = range(start_port, end_port + 1)
            tasks = [(args.target, int(p)) for p in ports_range]

        elif args.all:
            ports_range = range(1, 65536)
            tasks = [(args.target, p) for p in ports_range]

        else:
            ports_range = [21, 22, 23, 25, 53, 80, 110, 139, 143, 443, 445, 465, 587,
                            993, 995, 1433, 3306, 3389, 5432, 6379, 8080, 8443, 5000,
                            5001, 8088, 17000, 17001, 25565, 27015]
            tasks = [(args.target, p, args.timeout) for p in ports_range]

        with concurrent.futures.ThreadPoolExecutor(max_workers=args.workers) as executor:
            futures = [executor.submit(scan_port, task) for task in tasks]

            for future in tqdm(concurrent.futures.as_completed(futures), total=len(tasks), unit="ports"):
                result = future.result()
                if result:
                    outputs.append(result)

        if args.format and args.output:
            if args.format.lower() == "csv":
                export_to_csv(filename=args.output, data=outputs)
            elif args.format.lower() == "json":
                export_to_json(filename=args.output, data=outputs)
            else:
                print(f"Unsupported format: {args.format}")

    except Exception as e:
        print(f"Exception occurred: {e}")

Three real upgrades here:

  1. max_workers=args.workers: the thread pool size is no longer left to Python's default; it's tied directly to the -w flag we added in cli.py.

  2. executor.submit + as_completed instead of executor.map: this is a meaningful shift, not just cosmetic. executor.map yields results in the order tasks were submitted, so a progress bar wrapped around it stalls behind one slow early port even while later ports have already finished. executor.submit() queues each task and returns a Future for it, and concurrent.futures.as_completed() yields each future the moment it finishes, regardless of order. That's exactly what tqdm needs to draw an accurate live progress bar, because the bar advances in real time as scans actually complete, not in submission order.

  3. Collecting outputs: every non-empty result from future.result() gets appended to a list, which is what eventually gets handed off to export_to_csv/export_to_json.
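The ordering difference is easy to see in isolation. In this sketch fake_scan is a made-up stand-in for scan_port that just sleeps, and tqdm is left out so the snippet only needs the standard library; in the real engine, tqdm wraps the as_completed iterator.

```python
import concurrent.futures
import time


def fake_scan(task):
    """Stand-in for scan_port: sleep for `delay` seconds, then report the port."""
    port, delay = task
    time.sleep(delay)
    return port


tasks = [(22, 0.3), (80, 0.1), (443, 0.2)]  # made-up ports and delays

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    in_submission_order = list(executor.map(fake_scan, tasks))

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(fake_scan, task) for task in tasks]
    in_completion_order = [f.result() for f in concurrent.futures.as_completed(futures)]

print(in_submission_order)   # [22, 80, 443]
print(in_completion_order)   # fastest first; with these delays, typically [80, 443, 22]
```

If port order matters for the final report, sorting the collected dicts by their "port" key after the loop gets it back.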

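Honest bug to flag here too: in the else branch (top-ports default scan), tasks are built as 3-tuples, (args.target, p, args.timeout), but scan_port still does host, port = task, expecting exactly two values. That raises ValueError: too many values to unpack (expected 2) the moment someone runs the scanner with just -t and no port flags (the most common, default use case). Because the error surfaces through future.result() inside run_engine's broad except Exception, the user only sees a one-line message instead of a traceback. The other three branches (-p, -r, -a) correctly build 2-tuples, though -p passes its ports as strings, so the port in [...] checks in get_service_detection never match and those scans always land in the generic fallback. This is the kind of bug that's easy to miss because it only shows up in the default path, and is top of the list for the v0.3.1 patch.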
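One possible shape for the fix, as a sketch only (the actual v0.3.1 patch may look different): build every task in one place so all four modes produce the same (host, port, timeout) tuple with integer ports. build_tasks, unpack_task and the shortened TOP_PORTS list are hypothetical names for the example.

```python
TOP_PORTS = [21, 22, 80, 443]  # shortened stand-in for the default port list


def build_tasks(target, timeout, port=None, port_range=None, scan_all=False):
    """Return (host, port, timeout) tuples, the same shape for every CLI mode."""
    if port:
        ports = [int(p) for p in port.split(",")]   # "-p 22,80" -> [22, 80]
    elif port_range:
        start, end = (int(part) for part in port_range.split("-", 1))
        if start > end:
            raise ValueError(f"Invalid range: {port_range}")
        ports = range(start, end + 1)
    elif scan_all:
        ports = range(1, 65536)                     # ports 1..65535
    else:
        ports = TOP_PORTS
    return [(target, p, timeout) for p in ports]


def unpack_task(task):
    """What scan_port would do first under this scheme: expect exactly three fields."""
    host, port, timeout = task
    return host, port, timeout


print(build_tasks("192.168.1.40", 1.5, port="22,80"))
# [('192.168.1.40', 22, 1.5), ('192.168.1.40', 80, 1.5)]
```

With a single builder, the timeout from -T reaches every scan mode and the tuple shape can't drift between branches again.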

nexusprobe/__init__.py

__version__ = "0.3.0"
__author__ = "codedloki"

from .cli import receive_args
from .engine import run_engine
from .scanner import scan_port
from .enums import outputformat
from .export import export_to_csv, export_to_json

Same pattern as v0.2.0 — package-level re-exports so from nexusprobe import run_engine works without digging into submodules. Two new exports added for the enum and export functions, keeping the public package surface in sync with the new features.

USAGE

# Default top-ports scan
python3 main.py -t 192.168.1.40

# Custom port range with 200 threads
python3 main.py -t 192.168.1.40 -r 1-1000 -w 200

# Scan all ports, export results to JSON
python3 main.py -t 192.168.1.40 -a -f json -o results.json

# Custom timeout for a slow/remote target
python3 main.py -t 192.168.1.40 -p 22,80,443 -T 3

CONCLUSION

v0.2.0 made NexusProbe fast. v0.3.0 makes it useful — open ports now come with context (what's actually running), scans give live feedback instead of a silent wait, and results can be exported and reused instead of disappearing into the terminal scrollback.

It's still rough around the edges — the 2-tuple/3-tuple mismatch in the default scan path and the dead sock.close() after an early return are both real bugs sitting in this version, and export_to_csv will blow up on an empty result set. Calling that out here instead of pretending v0.3.0 is polished, because that's the whole point of these build logs — documenting the thing as it actually is, not the cleaned-up version.

Next up: fixing those bugs properly, expanding service detection to more ports (RDP, SMB, databases), and probably looking at async I/O (asyncio) as an alternative to threads for scaling past a few thousand concurrent connections.

Source Code: v0.3.0

Dev Logs: Building Tools

Part 3 of 3

Behind the scenes of my coding journey. From Python scripts to Rust-based system tools like Nexus Probe. Documenting architectural decisions, debugging sessions, and the evolution of my projects
