Thursday, May 23, 2019

Programming a Netronome network card (inside a VM) from command line

This is the same card(s) we got to work inside a vm guest using the magic of PCI passthrough. Netronome wants us to use a Windows-only IDE to do development work on it while the card sits in a Linux box we can reach; some of its features remind me of the IDE Google has for Android, which lets you run an emulator and do some real-time debugging. The difference is the Google one works on Linux, Windows, and Mac, and it only requires one computer (which could be remotely accessed).

Do we really need to use the Netronome IDE? I guess it depends on what we want to do. For now, let's see if we can get something running using the command line only, to the point where we can compile Micro-C and run something on the card.


Get the packages

  1. First we need a few packages available for either CentOS or Ubuntu.
    Note: The Netronome Linux SDK officially supports only CentOS and Ubuntu, so we will only be covering those distros.
    • Ubuntu:
      apt-get install libftdi1 libjansson4 build-essential \
       linux-headers-`uname -r` dkms git
    • CentOS: (Still using yum; I will do a dnf version when I feel like it.)
      yum -y install epel-release && yum update -y
      yum -y install libftdi jansson pciutils kernel-devel dkms wget git

    Netronome does require you to have an account to get the SDK packages. I can't help with that; what I can tell you is that once I got the account I downloaded everything that was available when I wrote this article:

    raub@desktop:~$ ls Downloads/netronome/
    SDK
    agilio-nfp-driver-dkms-2018.01.11.2333.f40482a-1.el7.noarch.rpm
    agilio-nfp-driver-dkms_2018.01.11.2333.f40482a_all.deb
    firmware
    readme
    raub@desktop:~$ ls Downloads/netronome/SDK/
    6.0.4.1 6.1.0.1
    raub@desktop:~$ ls Downloads/netronome/SDK/6.1.0.1/
    nfp-sdk-6.1.0.1-preview-3286-setup.exe
    nfp-sdk-6.1.0.1_preview-0-3243.x86_64.rpm
    nfp-sdk-p4-rte-6.1.0.1-preview-3202.centos.x86_64.tar
    nfp-sdk-p4-rte-6.1.0.1-preview-3214.ubuntu.x86_64.tar
    nfp-sdk-sim-6.1.0.0-preview-3179.x86_64.tar
    nfp-sdk_6.1.0.1-preview-3243-2_amd64.deb
    nfp-toolchain-6.1.0.1-preview-3243.x86_64.tar
    raub@desktop:~$

    and then copied them all to the development vm guest we created in the previous article, desktop1.

    What are those files? I put what I have gathered about their function in the readme file (it covers the older file versions, but it should give you an idea):
    Programmer Studio IDE
    nfp-sdk-6.0.4.1-3276-setup.exe - Windows
    
    Run Time Environment (RTE)
    nfp-sdk-p4-rte-6.0.4.1-3195.ubuntu.x86_64.tgz
    nfp-sdk-p4-rte-6.0.4.1-3191.centos.x86_64.tgz
    
    Hosted Toolchain (to be used with BSP and SmartNIC)
    nfp-sdk_6.0.4.1-3227-2_amd64.deb
    nfp-sdk-6.0.4.1-0-3227.x86_64.rpm
    
    NFP Simulator
    nfp-sdk-sim-6.0.4.1-3177.x86_64.tgz
    
    Hosted Toolchain (to be used with NFP Simulator)
    nfp-toolchain-6.0.4.1-3227.x86_64.tgz
    Note: There are two versions of the SDK. Just pick the latest.
  2. Then install the basic SDK
    • Ubuntu:
      sudo dpkg -i nfp-sdk_6.1.0.1-preview-3243-2_amd64.deb
    • CentOS:
      sudo rpm -ivh nfp-sdk-6.1.0.1_preview-0-3243.x86_64.rpm

    This creates a /opt/netronome directory.

  3. And add the directory where the binaries were installed to the PATH.
    cat >> ~/.bash_profile << 'EOF'
    
    # Netronome SDK
    PATH=$PATH:/opt/netronome/bin
    export PATH
    EOF
    source ~/.bash_profile
    Note: If the user you are building your code as does not have rights to write to the card, you should edit root's .bash_profile as well.
  4. We do need Netronome's modified but open source nfp driver, which has the development features we need (specifically, the nfp_dev_cpp option, which exposes the low-level user-space access ABIs of non-netdev mode). So, we install it, which requires adding the Netronome repo:

    • Ubuntu:
      wget https://deb.netronome.com/gpg/NetronomePublic.key
      apt-key add NetronomePublic.key
      add-apt-repository "deb https://deb.netronome.com/apt stable main"
      apt-get update
      apt-get install agilio-nfp-driver-dkms
    • CentOS:
      wget https://rpm.netronome.com/gpg/NetronomePublic.key
      rpm --import NetronomePublic.key
      cat << 'EOF' > /etc/yum.repos.d/netronome.repo
      [netronome]
      name=netronome
      baseurl=https://rpm.netronome.com/repos/centos/
      gpgcheck=0
      enabled=1
      EOF
      yum makecache
      yum install -y agilio-nfp-driver-dkms --nogpgcheck
    and then reboot.
  5. Now we can install the RTE
    • Ubuntu:
      tar xvf nfp-sdk-p4-rte-6.1.0.1-preview-3214.ubuntu.x86_64.tar
      cd nfp-sdk-6-rte-v6.1.0.1-preview-Ubuntu-Release-r2750-2018-10-10-ubuntu.binary/
      sudo ./sdk6_rte_install.sh install
    • CentOS:
      tar xvf nfp-sdk-p4-rte-6.1.0.1-preview-3202.centos.x86_64.tar
      cd nfp-sdk-6-rte-v6.1.0.1-preview-CentOS-Release-r2749-2018-10-09-centos.binary/
      sudo ./sdk6_rte_install.sh install
    This should cause /opt/netronome/bin/ to fill with many more files; this is a good way to check progress.
    NOTE: Chances are it will get pissed:
    [...]
    Loaded plugins: fastestmirror
    Examining /home/centos/netronome/SDK/6.1.0.1/nfp-sdk-6-rte-v6.1.0.1-preview-CentOS-Release-r2749-2018-10-09-centos.binary/dependencies/nfp-bsp/rpm//nfp-bsp-dkms_2018.08.17.1104_all.rpm: nfp-bsp-dkms-2018.08.17.1104-1dkms.noarch
    Marking /home/centos/netronome/SDK/6.1.0.1/nfp-sdk-6-rte-v6.1.0.1-preview-CentOS-Release-r2749-2018-10-09-centos.binary/dependencies/nfp-bsp/rpm//nfp-bsp-dkms_2018.08.17.1104_all.rpm to be installed
    Resolving Dependencies
    --> Running transaction check
    ---> Package nfp-bsp-dkms.noarch 0:2018.08.17.1104-1dkms will be installed
    --> Processing Conflict: agilio-nfp-driver-dkms-2019.04.02.0225.bf81349-1.el7.noarch conflicts nfp-bsp-dkms
    Loading mirror speeds from cached hostfile
     * base: packages.oit.ncsu.edu
     * epel: mirror.umd.edu
     * extras: packages.oit.ncsu.edu
     * updates: packages.oit.ncsu.edu
    No package matched to upgrade: nfp-bsp-dkms
    --> Finished Dependency Resolution
    Error: agilio-nfp-driver-dkms conflicts with nfp-bsp-dkms-2018.08.17.1104-1dkms.noarch
     You could try using --skip-broken to work around the problem
     You could try running: rpm -Va --nofiles --nodigest
    Error! There are no instances of module: nfp-bsp-dkms
    located in the DKMS tree.
    [centos@desktop1 nfp-sdk-6-rte-v6.1.0.1-preview-CentOS-Release-r2749-2018-10-09-centos.binary]$
    but it will get over it and work fine.
  6. Ensure that nfp_dev_cpp = 1
    theuser@desktop1:~$ cat /sys/module/nfp/parameters/nfp_dev_cpp
    1
    theuser@desktop1:~$ 

    If not, say, you get an error message like this

    [theuser@desktop1 ~]# cat /sys/module/nfp/parameters/nfp_dev_cpp
    cat: /sys/module/nfp/parameters/nfp_dev_cpp: No such file or directory
    [theuser@desktop1 ~]#

    unload nfp and load it back with the option set. There are ways to load said option at boot time; one of them is sketched right after this list, and the rest I will leave as an exercise to the reader.

    theuser@desktop1:~$ sudo modprobe -r -v nfp && sudo modprobe nfp nfp_dev_cpp=1
    theuser@desktop1:~$
  7. Ensure nfp-hwinfo is talking to the card. The expected outcome should look like this:
    theuser@desktop1:~$ sudo /opt/netronome/bin/nfp-hwinfo
    nfp.interface=pci.0.0
    nfp.model=0x40010010
    nfp.serial=00:15:4d:13:5d:2b
    board.exec=bootloader.bin
    uart.baud=115200
    preinit.setup.version=nfp-bsp-6000-b0 (4ef1e19ba176)
    pcie0.type=ep
    assembly.revision=11
    assembly.model=lithium
    assembly.partno=AMDA0096-0001
    assembly.serial=17290647
    assembly.vendor=SMC
    ddr0.spd=spi:1:0:0x3F0F00
    ddr1.spd=spi:1:0:0x3F0F00
    ddr2.spd=none
    ddr3.spd=none
    ddr4.spd=none
    ddr5.spd=none
    emu1.type=cache
    emu2.type=cache
    ethm.mac=00:15:4d:13:5d:2b
    eth.mac=00:15:4d:13:5d:2c
    eth.macs=2
    vpd=fis:1:0:vpd.bin
    board.setup.version=nfp-bsp-6000-b0 (4ef1e19ba176)
    chip.model=NFP4001
    chip.revision=B0
    core.speed=633
    me.speed=633
    arm.speed=475
    chip.model.device=0x62006c20
    chip.identifier=0x219b8546c
    chip.model.hard=0x5
    chip.model.soft=0x40010096
    chip.route=0xc96f1e8e
    chip.island=0x1001f13000112
    mem.setup.version=nfp-bsp-6000-b0 (4ef1e19ba176)
    ddr0.mem.size=1024
    ddr1.mem.size=1024
    ddr0.mem.speed=1600
    ddr1.mem.speed=1600
    emu0.mem.size=2048
    emu0.mem.base=0x2000000000
    emu1.mem.size=3
    [...]
    theuser@desktop1:~$

    If it looks like this:

    theuser@desktop1:~$ sudo /opt/netronome/bin/nfp-hwinfo
    /opt/netronome/bin/nfp-hwinfo: Failed to open NFP device 0 (No such device)
    Please check that:
     -lspci -d 19ee: shows atleast one Netronome device
     -the nfp device number is correct
     -the user has read and write permissions to the Netronome device
     -the nfp.ko module is loaded
     -the nfp_dev_cpp option is enabled (please try modinfo nfp to see all params)
    theuser@desktop1:~$ 
    stop, do not continue. Go back and check if nfp_dev_cpp = 1 and also if the vm was configured to support PCIe cards. Do not continue until you have checked and addressed these two items.
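
As promised in step 6, here is one way to make the nfp_dev_cpp option stick across reboots. This is just a minimal sketch, assuming your distro reads option files from /etc/modprobe.d (both CentOS and Ubuntu do); the file name nfp.conf is my choice, not something the SDK mandates:

echo "options nfp nfp_dev_cpp=1" | sudo tee /etc/modprobe.d/nfp.conf
# Reload the driver (or reboot) and verify the option took
sudo modprobe -r nfp && sudo modprobe nfp
cat /sys/module/nfp/parameters/nfp_dev_cpp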

Coding, at last!

This Hello World was stolen from Netronome's appropriately named Hello World example. I will be rushing through it, concentrating on getting it to compile and showing some common issues; the whole command sequence is recapped after the steps below. Look up what each line does in the example docs.

  1. So we create our hello world project using lab_template as the, well, template.
    mkdir dev
    cd dev
    git clone https://github.com/open-nfpsw/c_packetprocessing.git
    cd c_packetprocessing/apps/
    cp -r lab_template lab_hello_world
    cd lab_hello_world
    NOTE: This creates a ~/dev/c_packetprocessing/apps/lab_hello_world directory. If you want to move it to a different location, edit the line
    ROOT_SRC_DIR  ?= $(realpath $(app_src_dir)/../..)
    in the Makefile.
  2. So far the hello world directory looks rather bare:

    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$ ls
    Makefile  README
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$

    so let's start populating it.

    cat > hello_world.c << 'EOF'
    #include <nfp.h>
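    /* __declspec(ctm) places these arrays in the island's Cluster Target Memory (CTM) */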
    __declspec(ctm) int old[] = {1,2,3,4,5,6,7,8,9,10};
    __declspec(ctm) int new[sizeof(old)/sizeof(int)];
    
    int main(void)
    {
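            /* __ctx() is the microengine context (thread) number; only context 0 copies old[] into new[] in reverse order */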
            if (__ctx() == 0)
            {
                    int i, size;
                    size = sizeof(old)/sizeof(int);
                    for (i=0; i < size; i++)
                    {
                            new[i] = old[size - i - 1];
                    }
            }
            return 0;
    }
    EOF
  3. We add a few lines to the Makefile. Their explanation is in the example docs; in short, they compile hello_world.c (together with the standard rtl.c) using the Micro-C compiler, assign the resulting object to microengines i32.me0 and i32.me1, and link the firmware with run-time symbols.

    sed -i -e '/^# Application definition starts here/ a\
    $(eval $(call micro_c.compile_with_rtl,hello_world_obj,hello_world.c)) \
    $(eval $(call fw.add_obj,hello_world,hello_world_obj,i32.me0 i32.me1)) \
    $(eval $(call fw.link_with_rtsyms,hello_world))' Makefile
  4. Time for some compiling!

    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$ make
    /opt/netronome/bin/nfcc -Fo/home/theuser/dev/c_packetprocessing/apps/lab_hello_world/ -Fe/home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world_obj.list -W3 -chip nfp-4xxx-b0 -Qspill=7 -Qnn_mode=1 -Qno_decl_volatile -single_dram_signal -Qnctx_mode=8 -I. -I/home/theuser/dev/c_packetprocessing/microc/include -I/home/theuser/dev/c_packetprocessing/microc/lib   /opt/netronome/components/standardlibrary/microc/src/rtl.c /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world.c
    /opt/netronome/bin/nfld -chip nfp-4xxx-b0 -mip -rtsyms -o /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world.fw -map /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world.map -u i32.me0 /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world_obj.list -u i32.me1 /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world_obj.list
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$

    which creates a few intermediate files:

    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$ ls
    hello_world.c   hello_world.map  hello_world_obj.list  README
    hello_world.fw  hello_world.obj  Makefile              rtl.obj
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$ cat hello_world.map
    Memory Map file: /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world.map
    Date: Tue May  7 10:39:30 2019
    
    nfld version: 6.0.4.1,  NFFW: /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world.fw
    
    Address       Region     ByteSize        Symbol
    ===================================================
    0x0000000000800000    i24.emem      108                 .mip
    0x0000000000000000    i32.ctm       704                 i32.me0.ctm_40$tls
    0x00000000000002c0    i32.ctm       704                 i32.me1.ctm_40$tls
    
    ImportVar                       Uninitialized Value
    ===================================================
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$
  5. Next we upload the firmware we created into the card. This needs to be run either as root or as a user who can write to the card.
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world# make load_hello_world
    nfp-nffw load --no-start /home/theuser/dev/c_packetprocessing/apps/lab_hello_world/hello_world.fw
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world#
    NOTE: If you see the following error message
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$ make load_hello_world
    nfp-nffw load --no-start /home/centos/dev/c_packetprocessing/apps/lab_hello_world/hello_world.fw
    nfp-nffw: Failed to open NFP device 0 (No such device)
    Please check that:
     -lspci -d 19ee: shows atleast one Netronome device
     -the nfp device number is correct
     -the user has read and write permissions to the Netronome device
     -the nfp.ko module is loaded
     -the nfp_dev_cpp option is enabled (please try modinfo nfp to see all params)
    nfp-nffw: Command 'load' failed
    make: *** [load_hello_world] Error 1
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$
    you should check whether
    • you are running make load_hello_world as a user who can write to the card,
    • nfp_dev_cpp = 1, and
    • the vm was configured to support PCIe cards.
    Go back in this document for instructions on how to do so.

    Now, if you see this error message

    [F] nfp6000_nffw.c:4643: Firmware already loaded. Unload first.
    Failed to load firmware: Operation not permitted
    nfp-nffw: Command 'load' failed
    Makefile:43: recipe for target 'load_hello_world' failed
    either you or someone else had already loaded firmware into the card. All you have to do is unload it
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$ nfp-nffw unload
    theuser@desktop1:~/dev/c_packetprocessing/apps/lab_hello_world$
    and then run make load_hello_world again.

  6. In the hello world instructions, the next step is to look at the card memory, since later on we will be writing to it. So, here it is.
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world# nfp-rtsym --len 176 i32.me0.ctm_40\$tls:0
    0x0000000000:  0x00000001 0x00000002 0x00000003 0x00000004
    0x0000000010:  0x00000005 0x00000006 0x00000007 0x00000008
    0x0000000020:  0x00000009 0x0000000a 0x00000000 0x00000000
    0x0000000030:  0x00000000 0x00000000 0x00000000 0x00000000
    *
    0x0000000050:  0x00000000 0x00000000 0x00000001 0x00000002
    0x0000000060:  0x00000003 0x00000004 0x00000005 0x00000006
    0x0000000070:  0x00000007 0x00000008 0x00000009 0x0000000a
    0x0000000080:  0x00000000 0x00000000 0x00000000 0x00000000
    *
    
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world#
  7. Unleash the code so it does things:
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world# make fw_start
    nfp-nffw start
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world# 
  8. If things went well, we can now see that the memory contents have changed:
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world# nfp-rtsym --len 176 i32.me0.ctm_40\$tls:0
    0x0000000000:  0x00000001 0x00000002 0x00000003 0x00000004
    0x0000000010:  0x00000005 0x00000006 0x00000007 0x00000008
    0x0000000020:  0x00000009 0x0000000a 0x00000000 0x00000000
    0x0000000030:  0x0000000a 0x00000009 0x00000008 0x00000007
    0x0000000040:  0x00000006 0x00000005 0x00000004 0x00000003
    0x0000000050:  0x00000002 0x00000001 0x00000001 0x00000002
    0x0000000060:  0x00000003 0x00000004 0x00000005 0x00000006
    0x0000000070:  0x00000007 0x00000008 0x00000009 0x0000000a
    0x0000000080:  0x00000000 0x00000000 0x00000000 0x00000000
    *
    
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world#
  9. Don't forget to unload the firmware by typing nfp-nffw unload!
  10. Checking that we are done
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world# nfp-rtsym --len 176 i32.me0.ctm_40\$tls:0
    No runtime symbol named 'i32.me0.ctm_40$tls'
    root@desktop1:/home/theuser/dev/c_packetprocessing/apps/lab_hello_world#
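
As mentioned before the steps, here is the whole cycle recapped in one place. Nothing new, just the commands used above collected together; everything from make load_hello_world on has to run as root or as a user who can write to the card, with /opt/netronome/bin in the PATH as set up earlier:

cd ~/dev/c_packetprocessing/apps/lab_hello_world
make                                          # compile hello_world.c and link hello_world.fw
make load_hello_world                         # upload the firmware into the card (no start)
nfp-rtsym --len 176 i32.me0.ctm_40\$tls:0     # peek at old[] and new[] before running
make fw_start                                 # let the microengines run
nfp-rtsym --len 176 i32.me0.ctm_40\$tls:0     # new[] should now hold old[] reversed
nfp-nffw unload                               # clean up when done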

So congratulations! You not only installed the SDK but also wrote and ran your first Netronome program. You may want to look into the Network Flow C Compiler User's Guide for further info on what you can do; I would post the link, but right on the front page it states it is Proprietary and Confidential.

Next time we will do some OpenFlow or P4 coding. Don't ask me which one it will be because I have not decided yet. Brain hurts!

What about the simulator? Maybe one day.

Friday, May 03, 2019

Passing a Network card to a KVM vm guest because we are too lazy to configure SR-IOV

This can be taken as a generic how-to about passing PCIe cards to a vm guest.

Why

I can come up with a lot of excuses. The bottom line is you want the vm guest to do something with the card that the vm host can't or shouldn't. For instance, what if we want to give a wireless card to a given vm guest? And the card is not supported by the vm host (I am looking at you, VMware ESXi), or the vm host does not know how to virtualize it in a meaningful way?

Note: What we are talking about here should work with any PCI/PCIe card, but we said we will be talking about network cards, so there.

The Card

The card is a PCIe network card; for this article it should be seen as a garden-variety network card. You probably will not let me leave it at that, so here is the info on the specific card I will be using in this article: it is a Netronome Agilio CX 2x10GbE (the one in the picture is a CX 1x40GbE, which I happen to own, hence the crappy picture), which is built around their NFP-4000 flow processor. A basic infomercial on it can be found at https://www.netronome.com/m/documents/PB_Agilio_CX_2x10GbE.pdf (it used to be https://www.netronome.com/media/documents/PB_Agilio_CX_2x10GbE.pdf, but I guess they thought media was too long a word. It also means that sometime after this article is posted the link will change again; no point in making them orange links). It is supposed to do things like KVM hypervisor support (SR-IOV comes to mind) right out of the box, so why would we want to pass the entire card through to a vm guest? Here are some reasons:

  • What if the card can do something and the VM abstraction layers do not expose that?
  • What if we want to program the card to do our bidding?
  • What if we want to change the firmware of the card? Some cards allow you to upgrade the firmware, or change it completely to use it for other thingies (the Netronome card in question fits this second option, details about that might be discussed in a future article).
  • Why did you pick this card? Hey, this is not a reason to pass the entire card, but I will answer it anyway: because I have a box with 3 of them I was going to use for something else (we may talk about that in a future article). With that said, I avoided going over any of the special sauce this card has. For the purpose of this article, it is just a PCIe card I want to give to a vm guest.

How

Finding the card

Ok, the card is inserted into the vm host, which booted properly. Now what? Well, we need to find where the card is so we can tell our guests. Most Linux distros come with lspci, which probulates the PCI bus. The trick is to search for the right pattern. Let's, for instance, look for network devices in one of my ESXi nodes:

[root@vmhost2:~] lspci | grep 'Network'
0000:00:19.0 Network controller: Intel Corporation 82579LM Gigabit Network Connection [vmnic0]
0000:04:00.0 Network controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) [vmnic1]
0000:04:00.1 Network controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) [vmnic2]
0000:05:00.0 Network controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) [vmnic3]
0000:05:00.1 Network controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) [vmnic4]
[root@vmhost2:~]

Notes

  1. ESXi is really not Linux; it is VMware's own kernel with a busybox-ish shell sprinkled over
  2. I just mentioned ESXi here because I needed another system I could run lspci on.
  3. The lspci options in ESXi are not as extensive as in garden-variety Linux, but they are good enough to show it in action.
  4. If we had searched for Intel Corporation we would have gotten many more replies, including the CPU itself. So, taking the time to get the right search string pays off.

If we were going to probulate in a Linux host, Ethernet works better than Network as the search pattern. We can even look at virtual interfaces KVM is feeding to a vm guest:

theuser@desktop1:~$ lspci |grep Ethernet
00:03.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:06.0 Ethernet controller: Red Hat, Inc. Virtio network device
theuser@desktop1:~$

Note that the 0000: is assumed. A very useful option available in the Linux version of lspci but not the ESXi one is -nn:

theuser@desktop1:~$ lspci -nn |grep Ethernet
00:03.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
00:06.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
theuser@desktop1:~$

The [1af4:1000] means [vendor_id:product_id]; remember it well.

For the Netronome cards we can just look for netronome since there should be no other devices matching that name besides the cards made by them:

raub@vmhost ~$ sudo lspci -nn|grep -i netronome
11:00.0 Ethernet controller [0200]: Netronome Systems, Inc. Device [19ee:4000]
raub@vmhost ~$

The card's PCI address is 11:00.0 (0000:11:00.0 with the domain included).
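
By the way, now that we know Netronome's PCI vendor ID is 19ee, we can also ask lspci to filter on it directly instead of grepping; the nfp tools even suggest this exact check in their error messages. A minimal sketch, which should print the same 11:00.0 line as above:

sudo lspci -nn -d 19ee: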

Handing out the card to the guest

Two things we need to do when passing a PCI device to a vm guest (a.k.a. desktop1 in this example):

  1. Tell the vm host to keep its hands off it. The reason is that, in the case of a network card, the host might want to configure it, creating network interfaces which it (vmhost) can use for its own nefarious purposes or which KVM can virtualize (as a Virtio network device or some other emulation) to hand out to the guests. Since we want to use said card for our own private nefarious purposes within a specific vm guest (desktop1), we are not going to be nice and share it.

    So we need to tell vmhost to leave it alone.

    • KVM knows it exists because it can look in the PCI chain by itself:
      [root@vmhost ~]# virsh nodedev-list | grep pci_0000_11
      pci_0000_11_00_0
      [root@vmhost ~]#
    • So now we can tell vmhost to leave pci-0000:11:00.0 alone:

      [root@vmhost ~]$ sudo virsh nodedev-dettach pci_0000_11_00_0
      Device pci_0000_11_00_0 detached
      
      [root@vmhost ~]$
  2. Tell the vm guest there is this shiny card it can lay its noodly appendages on.
    1. Shut the vm guest down.
    2. Edit the vm guest definition.
      virsh edit desktop1
    3. Add something like
      <hostdev mode='subsystem' type='pci' managed='yes'>
            <source>
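                <!-- domain/bus/slot/function spell out the host PCI address 0000:11:00.0 we found earlier -->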
                <address domain='0x0000' bus='0x11' slot='0x00' function='0x0'/>
            </source>
          </hostdev>
      to the end of the devices section. When you save it, libvirt will properly place
      and configure the entry.
    4. Restart the vm guest and check if it can see the card using dmesg (Ubuntu 19.04 example; note it is being listed as pci-0000:04:00.0 inside the vm guest). I expect to see something like

      [    7.348276] Netronome NFP CPP API
      [    7.352347] nfp-net-vnic: NFP vNIC driver, Copyright (C) 2010-2015 Netronome Systems
      [    7.361865] nfp 0000:04:00.0: Netronome Flow Processor NFP4000/NFP5000/NFP6000 PCIe Card Probe
      [    7.372133] nfp 0000:04:00.0: RESERVED BARs: 0.0: General/MSI-X SRAM, 0.1: PCIe XPB/MSI-X PBA, 0.4: Explicit0, 0.5: Explicit1, free: 20/24
      [    7.396094] nfp 0000:11:00.0: Model: 0x40010010, SN: 00:15:4d:13:5d:58, Ifc: 0x10ff

      But what I am getting is something more like this:

      [    1.768683] nfp: NFP PCIe Driver, Copyright (C) 2014-2017 Netronome Systems
      [    1.773014] nfp 0000:00:07.0: Netronome Flow Processor NFP4000/NFP5000/NFP6000 PCIe Card Probe
      [    1.774066] nfp 0000:00:07.0: 63.008 Gb/s available PCIe bandwidth (8 GT/s x8 link)
      [    1.775212] nfp 0000:00:07.0: can't find PCIe Serial Number Capability
      [    1.776252] nfp 0000:00:07.0: Interface type 15 is not the expected 1
      [    1.777285] nfp 0000:00:07.0: NFP6000 PCI setup failed

      What is going on? The answer to that is the next topic. You see,

PCIe is more demanding

Do you remember the can't find PCIe Serial Number Capability message? This is a PCIe card, meaning we need to set the vm guest machine type to q35, which emulates the ICH9 chipset and can handle a PCIe bus. The default (i440FX) can only do a PCI bus. QEMU has a nice description of the difference. So, let's give it a try by recreating the KVM guest:

virt-install \
   --name desktop1 \
   --disk path=/home/raub/desktop1.qcow2,format=qcow2,size=10 \
   --ram 4098 --vcpus 2 \
   --cdrom /export/public/ISOs/Linux/ubuntu/ubuntu-16.04.5-server-amd64.iso  \
   --os-type linux --os-variant ubuntu19.04 \
   --network network=default \
   --graphics vnc --noautoconsole \
   --machine=q35 \
   --arch x86_64

When we try to build that vm guest, we get an error message stating that

ERROR    No domains available for virt type 'hvm', arch 'x86_64', machine type 'q35'

What now? You see, at the time I wrote this, the CentOS KVM package does not support q35 out of the box. We need more packages!

yum install centos-release-qemu-ev
yum update
reboot

And we try again. This time, when we log in to the guest, desktop1, things look more promising (note the PCI address changed to 0000:01:00.0; this is a new vm guest):

theuser@desktop1:~$ dmesg |grep -i netro
[    1.922051] nfp: NFP PCIe Driver, Copyright (C) 2014-2017 Netronome Systems
[    1.954196] nfp 0000:01:00.0: Netronome Flow Processor NFP4000/NFP5000/NFP6000 PCIe Card Probe
[    2.239018] nfp 0000:01:00.0: nfp:   netronome/serial-00-15-4d-13-5d-46-10-ff.nffw: not found
[    2.239059] nfp 0000:01:00.0: nfp:   netronome/pci-0000:01:00.0.nffw: not found
[    2.239913] nfp 0000:01:00.0: nfp:   netronome/nic_AMDA0096-0001_2x10.nffw: found, loading...
[   11.954477] nfp 0000:01:00.0 eth0: Netronome NFP-6xxx Netdev: TxQs=2/32 RxQs=2/32
[   11.971175] nfp 0000:01:00.0 eth1: Netronome NFP-6xxx Netdev: TxQs=2/31 RxQs=2/31
theuser@desktop1:~$

Which then becomes

theuser@desktop1:~$ ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3:  mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:d4:9e:50 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.105/24 brd 192.168.122.255 scope global dynamic enp0s3
       valid_lft 3489sec preferred_lft 3489sec
    inet6 fe80::5054:ff:fed4:9e50/64 scope link
       valid_lft forever preferred_lft forever
3: enp1s0np0:  mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:15:4d:13:5d:47 brd ff:ff:ff:ff:ff:ff
4: enp1s0np1:  mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:15:4d:13:5d:48 brd ff:ff:ff:ff:ff:ff
theuser@desktop1:~$

And now we can do something useful with it.
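
For instance, as a quick sanity check (a sketch assuming the interface names shown above; yours may differ), we can bring one of the card's ports up from inside the guest and look at it:

sudo ip link set enp1s0np0 up
ip link show enp1s0np0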

References

  • https://stackoverflow.com/questions/14061840/kvm-and-libvirt-wrong-cpu-type-in-virtual-host
  • https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-kvm_guest_virtual_machine_compatibility-supported_cpu_models
  • https://github.com/libvirt/libvirt/blob/v4.0.0/src/util/virarch.c#L37