

NetBox as a Living Source of Truth: How We Manage HPC Lab Infrastructure

  • Industry Trends, Operations & Maintenance
  • May 13, 2026
  • Matthew Kriegh

Rethinking Infrastructure Management in High-Performance Computing

If you’ve ever relied on spreadsheets, scattered documentation, or tribal knowledge to track what’s in your data center, you already know the problem. In high-performance computing (HPC) environments, where dense racks, multi-node chassis, mixed networking, and constant repurposing are the norm, that approach doesn’t just fall short; it becomes a liability.

That’s why we adopted NetBox, the open-source data center infrastructure management (DCIM) platform, as the authoritative source of truth for our RedLine lab. But we didn’t stop at adoption. We turned NetBox into a living, self-maintaining model of our entire infrastructure, one that engineers and admins can actually trust.

Why NetBox? Because Automation Demands Accuracy

NetBox isn’t just an inventory tool. It’s a structured database built around real data center concepts: racks, devices, interfaces, cables, VLANs, IP addresses, and more. Every physical and logical object in our environment is modeled, normalized, and queryable from a single pane of glass.

For our HPC lab, the value proposition is clear: automation depends on correct data. When provisioning scripts, monitoring systems, and configuration pipelines all consume infrastructure data, that data must be accurate. Not “accurate as of last Tuesday.” We need real-time fidelity.

NetBox maps closely to our environment. Racks, chassis, blades, NICs, switch ports, and VMs are all represented consistently. We always know what hardware we have, where it’s physically located, and what it’s connected to.

Treating NetBox as a Living Data Set

The real differentiator isn’t NetBox itself but how we use it. We don’t manually maintain our instance. Instead, we’ve built custom sync pipelines that continuously enrich NetBox with live data pulled from multiple sources:

  • IPMI/iDRAC — for hardware facts like model, serial number, and MAC addresses
  • Proxmox — for VM inventory, resource utilization, and cluster state
  • pfSense — for DHCP leases and firewall-side IP assignments
  • FreeIPA — for DNS names and domain integration
  • Switch MAC address tables — for physical port-to-device correlation

The result? Cables, MACs, VLANs, and IPs are automatically validated and corrected. When someone moves a cable, renames a device, or spins up a VM, NetBox knows about it, not because someone updated a spreadsheet, but because the automation pipeline detected the change.
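
The validate-and-correct step can be pictured as a comparison between what NetBox has recorded and what the live sources report. The following is a minimal sketch of that idea in Python, assuming each side has already been reduced to a dictionary keyed by interface. The `Change` class, the `reconcile` helper, the key format, and the sample values are all hypothetical and illustrative rather than taken from the production pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One correction a sync job would push to NetBox."""
    key: str          # e.g. "node01:eth0" (hypothetical naming scheme)
    field: str        # attribute that drifted, or "*" for a whole object
    recorded: object  # value currently stored in NetBox
    observed: object  # value reported by the live source


def reconcile(recorded: dict, observed: dict) -> list[Change]:
    """Compare recorded state with observed state and list the differences."""
    changes = []
    for key in sorted(recorded.keys() | observed.keys()):
        old, new = recorded.get(key), observed.get(key)
        if old is None or new is None:
            # Object exists on only one side: it needs creating or retiring.
            changes.append(Change(key, "*", old, new))
            continue
        for field in sorted(old.keys() | new.keys()):
            if old.get(field) != new.get(field):
                changes.append(Change(key, field, old.get(field), new.get(field)))
    return changes


# Hypothetical sample data: what NetBox holds vs. what the sources report.
recorded = {
    "node01:eth0": {"mac": "aa:bb:cc:00:00:01", "vlan": 10},
    "node02:eth0": {"mac": "aa:bb:cc:00:00:02", "vlan": 10},
}
observed = {
    "node01:eth0": {"mac": "aa:bb:cc:00:00:01", "vlan": 20},  # VLAN changed
    "node03:eth0": {"mac": "aa:bb:cc:00:00:03", "vlan": 10},  # new interface
}

for change in reconcile(recorded, observed):
    print(change)
```

Producing a list of changes first, rather than writing to NetBox directly, keeps the comparison easy to test and leaves room for a dry-run mode before anything is corrected.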

This eliminates manual entry errors, reduces configuration drift, and gives every engineer full clarity into lab operations.

From Raw Data to Structured Intelligence

One of the most powerful components of our workflow is the custom scripting layer built on pynetbox, the official Python API client for NetBox. Using pynetbox alongside NetBox’s full REST and GraphQL APIs, we can programmatically create devices, assign interfaces, populate IPs, and build cable records — all from a single command.
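
As a rough illustration of what such a command might do under the hood, here is a hedged pynetbox sketch that creates a device, one interface, its IP address, and the cable to a switch port. It assumes NetBox 3.6 or later (where the device field is named `role` and cables take termination lists). The `onboard_server` and `connect` helpers, the interface type, and every ID and address are placeholders, not the script described in this post.

```python
def connect(url, token):
    """Return a pynetbox client (requires `pip install pynetbox`)."""
    import pynetbox  # imported lazily so onboard_server has no hard dependency
    return pynetbox.api(url, token=token)


def onboard_server(nb, name, device_type_id, role_id, site_id,
                   nic_name, address, switch_port_id):
    """Create a device, one interface, its IP, and the cable to a switch port.

    `nb` is expected to be a pynetbox API object such as connect() returns.
    """
    # NetBox 3.6+ names this field `role`; older releases use `device_role`.
    device = nb.dcim.devices.create(
        name=name, device_type=device_type_id, role=role_id, site=site_id
    )
    iface = nb.dcim.interfaces.create(
        device=device.id, name=nic_name, type="1000base-t"
    )
    # Attach the address to the new interface via the generic object fields.
    nb.ipam.ip_addresses.create(
        address=address,
        assigned_object_type="dcim.interface",
        assigned_object_id=iface.id,
    )
    # Termination lists are the cable format used by NetBox 3.3 and later.
    nb.dcim.cables.create(
        a_terminations=[{"object_type": "dcim.interface", "object_id": iface.id}],
        b_terminations=[{"object_type": "dcim.interface", "object_id": switch_port_id}],
    )
    return device
```

With a client from `connect(url, token)`, a call such as `onboard_server(nb, "node01", 1, 1, 1, "eth0", "10.0.0.11/24", 42)` would issue the four create requests in order. The numeric IDs stand in for objects that would normally be looked up first.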

Consider our IPMI importer script as an example. It connects to a device’s out-of-band management interface, authenticates, and pulls key hardware facts. It then correlates those facts against switch MAC tables to determine exactly which switch port a server is connected to. From there, it automatically creates the cable object, assigns VLANs, populates interface records, and updates rack position metadata.
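
The correlation step is the heart of that workflow, so a small sketch may help. The Python below assumes the switch MAC tables have already been parsed into `(switch, port, mac)` rows and filtered to edge ports. The `normalize_mac` and `locate_ports` helpers, the switch name, and the addresses are hypothetical, and the real importer may handle this differently.

```python
import re


def normalize_mac(mac: str) -> str:
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def locate_ports(server_macs, mac_table):
    """Map each server MAC to the (switch, port) where it was learned.

    mac_table is a list of (switch, port, mac) rows. A MAC that no switch
    has learned maps to None.
    """
    learned = {normalize_mac(mac): (switch, port) for switch, port, mac in mac_table}
    return {mac: learned.get(normalize_mac(mac)) for mac in server_macs}


# Hypothetical rows; vendors print MACs in different notations.
mac_table = [
    ("sw-rack1", "Ethernet1/7", "AABB.CC00.0001"),
    ("sw-rack1", "Ethernet1/8", "aa-bb-cc-00-00-02"),
]
server_macs = ["aa:bb:cc:00:00:01", "aa:bb:cc:00:00:09"]

print(locate_ports(server_macs, mac_table))
```

Normalizing both sides before comparing matters because out-of-band controllers and switches rarely agree on MAC formatting. Uplink and trunk ports learn many addresses, so in practice the table would need to exclude them, or the match could point at the wrong port.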

The goal: onboard a server with a single command instead of dozens of manual entries.

Extending NetBox with Custom Plug-ins

Out of the box, NetBox covers core DCIM and IPAM use cases. But HPC labs have unique operational needs. That’s why we built a custom Projects plug-in that lets us assign physical resources — compute nodes, storage, networking — to specific projects and teams.

Each project tracks ownership, team leads, and resource allocation. Engineers can immediately see which resources are in use, which are available, and how to plan for new workloads. This eliminates resource conflicts, provides accountability, and gives leadership a clear operational view of lab utilization.
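
To make the allocation idea concrete, here is a small standalone Python sketch of the bookkeeping involved. It is not the plug-in itself: a real NetBox plug-in would define Django models inside NetBox, while this only models the one-owner-per-device rule. The `Project` and `Allocations` classes and all names in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    lead: str
    devices: set = field(default_factory=set)


class Allocations:
    """Tracks which project owns each device, allowing one owner per device."""

    def __init__(self, inventory):
        self.inventory = set(inventory)
        self.owner = {}  # device name -> Project

    def assign(self, project, device):
        """Give `device` to `project`, refusing unknown or already-owned devices."""
        if device not in self.inventory:
            raise KeyError(f"unknown device: {device}")
        current = self.owner.get(device)
        if current is not None and current is not project:
            raise ValueError(f"{device} already belongs to {current.name}")
        self.owner[device] = project
        project.devices.add(device)

    def available(self):
        """Return the devices no project has claimed, sorted by name."""
        return sorted(self.inventory - set(self.owner))
```

For example, after creating `Allocations(["node01", "node02", "gpu01"])` and assigning `node01` and `gpu01` to two different projects, `available()` returns `["node02"]`, and a second project's attempt to claim `node01` raises `ValueError`. Rejecting the conflicting assignment at write time is what prevents two teams from planning around the same hardware.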

The Bigger Picture

NetBox has fundamentally changed how we operate. Instead of maintaining static documentation that’s outdated the moment it’s written, we maintain a dynamic, API-driven model that reflects reality at all times. Infrastructure becomes queryable, auditable, and automatable.

For any team managing complex, high-density environments, whether HPC, cloud labs, or enterprise data centers, the takeaway is simple: your infrastructure data is only as valuable as it is accurate, and accuracy at scale requires automation.

NetBox, combined with the right integrations and a commitment to treating infrastructure as code, delivers exactly that.
