Fabric Configuration
Overview
The fab.yaml file is the configuration file for the fabric. It supplies
the configuration of the users, their credentials, logging, telemetry, and
other non-wiring-related settings. The fab.yaml file is composed of multiple
YAML objects inside a single file. Per the YAML specification, three hyphens
(---) on a line by themselves separate the end of one object from the
beginning of the next. There are two YAML objects in the fab.yaml file. For
more information about how to use hhfab init, run hhfab init --help.
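As a rough illustration of that layout, the skeleton below shows two YAML objects separated by ---. The apiVersion, kind, and metadata values are assumptions based on the Fabricator API naming, and the spec bodies are deliberately left empty. The file emitted by hhfab init remains the authoritative template.

```yaml
# Sketch of the fab.yaml layout only. All names are assumptions;
# verify them against the file generated by hhfab init.
apiVersion: fabricator.githedgehog.com/v1beta1  # assumed API group/version
kind: Fabricator         # first object: fabric-wide settings
metadata:
  name: default          # assumed name
  namespace: fab         # assumed namespace
spec:
  config: {}             # users, NTP, telemetry, and so on (elided)
# The three hyphens below end the first object and start the second.
---
apiVersion: fabricator.githedgehog.com/v1beta1  # assumed API group/version
kind: ControlNode        # second object: control node settings
metadata:
  name: control-1        # hypothetical node name
  namespace: fab         # assumed namespace
spec: {}                 # interfaces and so on (elided)
```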
HHFAB workflow
After hhfab has been downloaded:
- hhfab init (see the different flags to customize the initial configuration)
- Adjust the fab.yaml file to your needs
- Build your wiring diagram
- hhfab validate
- (optionally) hhfab diagram
- hhfab build
Or import existing fab.yaml and wiring files:
- hhfab init -c fab.yaml -w wiring-file.yaml -w extra-wiring-file.yaml
- hhfab validate
- Build your wiring diagram
- (optionally) hhfab diagram
- hhfab build
After completing the workflow above, you will have a .img file suitable for installing the control node and then bringing up the switches that comprise the fabric.
Complete Example File
The following example outlines a comprehensive Fabricator configuration. You can find further configuration details in the Fabricator API Reference.
Configure Control Node and Switch Users
Control Node Users
Control node and switch users are configured either by passing
--default-password-hash to hhfab init or by editing the fab.yaml file
emitted by hhfab init. The default username on the control node is
core.
Switch Users
There are two users on the switches, admin and operator. The operator user has
read-only access to the sonic-cli command on the switches. The admin user has
broad administrative power on the switch.
To avoid conflicts, do not use the following usernames: operator, hhagent, netops.
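The user settings described above might be laid out in the Fabricator object roughly as in the sketch below. The field names (defaultUser, defaultSwitchUsers, role, password, authorizedKeys) and their nesting are assumptions, the username op is a hypothetical name chosen to avoid the reserved names, and the angle-bracket values are placeholders. Check the Fabricator API Reference before relying on any of it.

```yaml
# Sketch only: field names and nesting are assumptions, not confirmed
# by this page. Placeholders in <...> must be replaced.
spec:
  config:
    control:
      defaultUser:                     # the "core" user on the control node
        password: "<password-hash>"    # a hash, never a plain-text password
        authorizedKeys:
          - "<ssh-public-key>"
    fabric:
      defaultSwitchUsers:
        admin:                         # broad administrative access
          role: admin
          password: "<password-hash>"
        op:                            # hypothetical name for the read-only user
          role: operator               # read-only access to sonic-cli
```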
NTP and DHCP
The control node uses public NTP servers from Cloudflare and Google by default. The control node runs a DHCP server on the management network. See the example file.
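For orientation, overriding the NTP servers might look like the fragment below. The ntpServers field and its position under spec.config.control are assumptions, and the hostnames are public Cloudflare and Google time servers used purely as an illustration. DHCP settings are not shown because this page does not name them.

```yaml
# Sketch only: the ntpServers field and its location are assumptions.
spec:
  config:
    control:
      ntpServers:
        - time.cloudflare.com   # public Cloudflare NTP
        - time1.google.com      # public Google NTP
```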
Control Node
The control node is the host that manages all the switches, runs k3s, and serves images. The management interface is used by the control node to manage the fabric switches, not for end-user management of the control node. For end-user management of the control node, specify the external interface name.
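The distinction between the two interfaces can be sketched as a ControlNode object like the one below. The field names (management, external, interface, ip) and the metadata are assumptions, and the interface names enp2s0 and enp2s1 are hypothetical. Substitute the names reported by your own hardware.

```yaml
# Sketch only: field names are assumptions and interface names are
# hypothetical examples.
apiVersion: fabricator.githedgehog.com/v1beta1  # assumed API group/version
kind: ControlNode
metadata:
  name: control-1          # hypothetical node name
  namespace: fab           # assumed namespace
spec:
  management:
    interface: enp2s1      # faces the fabric switches
  external:
    interface: enp2s0      # used by operators to reach the control node
    ip: dhcp               # assumed option: address obtained via DHCP
```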
Telemetry
There is an option to enable Grafana Alloy on all switches to forward metrics
and logs to the configured targets using the Prometheus Remote-Write API and
the Loki API. Metrics include port speeds, counters, errors, operational
status, transceivers, fans, power supplies, temperature sensors, BGP
neighbors, LLDP neighbors, and more. Logs include Hedgehog agent logs. Modify
the URL as needed: instead of /api/v1/push it could be /api/v1/write. Check
the documentation for the data provider.
Switches push telemetry data through a proxy running in a pod on the control node. Switches do not have direct access to the Internet. Configure the control node to be able to reach and resolve the location of the Prometheus and Loki servers.
Telemetry can be enabled after installation of the fabric. There are two YAML objects that control the telemetry configuration. The first YAML object configures the credentials and URL for the collectors. The second configures which metrics are sent via Grafana Alloy.
Credentials
The first object provides the URL and credentials for sending the telemetry. These can be obtained from the Grafana Cloud dashboard by selecting details on the desired stack, then details again on the Prometheus collector. Be sure to choose the URL for "Remote Write". Use the YAML listing below as a template and fill in your username, token/password, and URL.
- Can be any name of your choosing
- Change to match your environment
- Can be any name of your choosing
To apply these changes to the fabric use:
Collecting and Pushing
The second YAML object controls which metrics are sent from the fabric to the collectors. By default, the full list of telemetry is sent from the fabric to Prometheus and Loki. In the example, the metrics are restricted to those matching the regular expression; everything else is discarded.
- The Hedgehog agent generates information from the ASIC ports and switch configuration
- This option mirrors the prometheus.relabel component
- Alloy is configured to use the prometheus.exporter.unix component
- This option mirrors the prometheus.relabel component
Users are encouraged to read the Grafana Alloy Docs on relabeling to ensure the desired metrics are selected. By default all metrics are sent to the collectors.
As above, to apply these changes to the fabric use the following command: