libvirt XML-RPC error

Is your oVirt panel throwing a libvirt XML-RPC error?

libvirtd is the server-side daemon, and the XML-RPC label in the message points at the remote procedure call layer that clients use to exchange information with it. When that communication breaks down, libvirt reports this error.

At Bobcares, we often receive requests to fix Libvirt errors with virtual machines as part of our Server Management Services.

Today, let’s discuss this in detail and see how our Support Engineers fix the libvirt XML-RPC error.

Causes for libvirt XML-RPC error

The libvirtd is the server-side daemon component of the libvirt virtualization management system. This daemon runs on host servers and performs management tasks for virtual guest machines.

XML-RPC is a simple protocol used to exchange information between computer systems over a network.

Usually, this error occurs when the client is unable to talk to the daemon, and there can be several reasons for that.

One reason is that the daemon is not running on the host server. Other reasons include an incorrect API or crashed connection instances.

Let us discuss how our Support Engineers resolve the error.

How to fix libvirt XML-RPC error

Recently, one of our customers contacted us about the libvirt XML-RPC error. Here is how our Support Engineers resolved it.

1. libvirtd service down

The virtual machine hosted on the server was not running. Our Support Engineers logged in to the server to analyze the error.

We found that the server was rebooted recently. So on checking the libvirtd service, we found that it was down. We started the service using the command:

service libvirtd start

We then enabled the service to start at boot using the command:

systemctl enable libvirtd
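
The check described above can be sketched as a small script. This is a minimal illustration, assuming a systemd-based host; `libvirtd_action` is a hypothetical helper name, and the script only prints a suggestion rather than changing anything on the server.

```shell
#!/bin/sh
# Map the state string printed by `systemctl is-active` to a suggested action.
# libvirtd_action is a hypothetical helper, not part of libvirt or systemd.
libvirtd_action() {
    case "$1" in
        active)          echo "running: nothing to do" ;;
        inactive|failed) echo "stopped: run 'systemctl start libvirtd'" ;;
        *)               echo "unknown state: check 'systemctl status libvirtd'" ;;
    esac
}

# Query the real service only where systemctl exists; otherwise report "unknown".
if command -v systemctl >/dev/null 2>&1; then
    state=$(systemctl is-active libvirtd 2>/dev/null) || true
else
    state=unknown
fi
libvirtd_action "${state:-unknown}"
```

Keeping the decision in a function that takes the state as an argument makes it easy to try the logic without touching a live daemon.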

2. Bug in libvirt

Another customer contacted us with the same error. On analyzing it, we found that the libvirtd service was running. A patch for libvirt had been released at the time, so we updated libvirt to the patched version.

After the update, the error was resolved.

3. Restart libvirtd service

One of the most common fixes for this error, but also a risky one, is restarting the libvirtd service. Restarting libvirtd might take down other VMs on the node as well.

In recent versions, a restart might not affect other VMs, whereas older versions kill the running VMs when libvirtd is restarted. We therefore restart the service only if the customer confirms that the other VMs can go offline for a while during the process.


Conclusion

In short, we have discussed the causes of the libvirt XML-RPC error. We have also seen how our Support Engineers start the service and apply the patch to resolve the error.


Contents

  • 1 Common XML problems
    • 1.1 Editing domain definition
    • 1.2 Passing XML documents to libvirt
    • 1.3 XML documents stored by libvirt
  • 2 XML Syntax errors
    • 2.1 Extra stray "<" in the document
      • 2.1.1 Symptom
      • 2.1.2 Solution
    • 2.2 Unterminated attribute
      • 2.2.1 Symptom
      • 2.2.2 Solution
    • 2.3 Forgotten "/" in unpaired tag, unended tag
      • 2.3.1 Symptom
      • 2.3.2 Solution
    • 2.4 Typos in tags
      • 2.4.1 Symptom
      • 2.4.2 Solution
  • 3 Logic and configuration errors
    • 3.1 Vanishing parts
    • 3.2 Validating XML against libvirt schemas
    • 3.3 Incorrect drive device type
      • 3.3.1 Symptom
      • 3.3.2 Solution

Common XML problems

XML documents are used across libvirt to store structured data. This document describes common problems with XML documents that are passed to libvirt through the API. Malformed XML documents, inappropriate values, missing elements, and similar mistakes may produce errors that this document helps to identify and eliminate.

Editing domain definition

Although it is not recommended, it is sometimes necessary (or simply easier) to edit a domain XML file manually. A domain’s XML can be edited with the following command:

# virsh edit dom

This command starts an editor with the current definition of the domain. After you finish your edits and save the changes, the XML is reloaded and parsed by libvirt. If the XML is correct, the following message is displayed:

# virsh edit dom

Domain dom XML configuration edited.

Note that if the XML is edited incorrectly, the "edit" command in virsh does not save the changes. Consider saving your changes to a separate file before exiting the editor.

Once you have saved the XML it is possible to use the virt-xml-validate command to check for usage problems:

# virt-xml-validate config.xml

If there are no errors, then your description is well-formed from an XML point of view and matches the libvirt schema. The schema cannot catch all constraints, but fixing any reported errors will get you further along.

Passing XML documents to libvirt

Numerous API functions of libvirt (and their implementations in virsh) take XML documents as arguments. These documents are passed to the XML parser, which detects syntax errors, and are then processed internally. Most errors

XML documents stored by libvirt

These documents contain definitions of domains, their states and configurations. All of those documents are automatically generated and should not be edited manually. Errors in these documents contain the file name of the broken document. The file name is valid only on the host machine defined by the URI. (It may be the machine the command is run on.)

Errors in files created by libvirt are very rare. One possible source of these errors is a downgrade of libvirt; newer versions of libvirt will always be able to read XML generated by older versions, but older versions of libvirt may be confused by XML elements added in newer libvirt.

XML Syntax errors

Syntax errors are caught by the XML parser. The error message contains useful information that can lead to identification of the problem. Example error message from XML parsing:

  error: (domain_definition):6: StartTag: invalid element name
  <vcpu>2</vcpu><
-----------------^

The error message consists of three lines. The first line holds the error message, and the two following lines contain the context of the XML file containing the error and a pointer to ease identification of the error.

Information contained in this message:

(domain_definition)
File name of the document that contains the error. File names in parentheses are symbolic names to describe XML documents parsed from memory (they don’t directly correspond to files on disk). File names that are not contained in parentheses are local files that reside on the target of the connection.
6
Line number that contains the error.
StartTag: invalid element name
Error message from the libxml2 parser.
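
The three fields described above can be pulled out of the first line of the message with a short script. This is a minimal sketch; `parse_xml_error` is a hypothetical helper name introduced only for illustration, and it assumes the message has the "error: document:line: message" shape shown in the example.

```shell
#!/bin/sh
# Split a libvirt XML parse error into document name, line number and parser message.
# parse_xml_error is a hypothetical helper name used only for this illustration.
parse_xml_error() {
    printf '%s\n' "$1" |
        sed -n 's/^ *error: \([^:]*\):\([0-9][0-9]*\): \(.*\)$/document=\1 line=\2 message=\3/p'
}

parse_xml_error "error: (domain_definition):6: StartTag: invalid element name"
```

For the example message, this prints document=(domain_definition) line=6 message=StartTag: invalid element name. Lines that do not match the expected shape produce no output.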

Extra stray "<" in the document

This snippet of a domain XML contains an extra "<" in the document:

<domain type='kvm'>
  <name>domain</name>
  <memory>524288</memory>
  <vcpu>2</vcpu><
  ...

Symptom

libvirt produces the following message:

  error: (domain_definition):6: StartTag: invalid element name
  <vcpu>2</vcpu><
-----------------^

This error message shows that the parser expects a new element name after the "<" symbol on line 6 of a domain definition XML document that was provided as a string.

Solution

Remove the extra "<" or finish the new element.

Unterminated attribute

This snippet of a domain XML contains an unterminated element attribute value:

<domain type='kvm>
  <name>domain</name>
  ...

Symptom

libvirt produces the following, less obvious message:

error: (domain_definition):2: Unescaped '<' not allowed in attributes values
  <name>domain</name>
--^

Solution

Close every attribute value string with its matching quotation mark or apostrophe.

Forgotten "/" in unpaired tag, unended tag

The following snippet contains an unended pair tag:

<domain type='kvm'>
 ...
 <features>
   <acpi/>
   <pae/>
 ...
</domain>

This is an example of a similar problem with an extra closing tag:

<domain type='kvm'>
  </name>
  ...
</domain>

Unpaired tags have to be ended with "/>". The following snippet contains an example of a tag not following this rule:

<domain type='kvm'>
  ...
  <clock offset='utc'>

Symptom

All of the errors above create the same error message:

error: (domain_definition):61: Opening and ending tag mismatch: clock line 16 and domain
</domain>
---------^

Identifying the root of the error is a little tricky. The error message contains three hints to identify the offending tag. The message after the last colon, clock line 16 and domain, states that the offending tag is <clock … on line 16 of the source document. The last hint is the tag shown in the context part of the message, which identifies the second offending tag.

Solution

End tags properly.

Typos in tags

The following examples contain XML tags broken by a whitespace or special-character typo:

<domain ty pe='kvm'>
  ...
<dom ain type='kvm'>
  ...
<dom#ain type='kvm'>
  ...

Symptom

Mistakes of this kind produce an error message like the following (shown here for the first example):

error: (domain_definition):1: Specification mandate value for attribute ty
<domain ty pe='kvm'>
-----------^

Solution

To identify the problematic tag, follow the guide provided by the pointer and context of the file.

Logic and configuration errors

A well-formed XML document can still contain constructs that are syntactically correct but not allowed by libvirt. There is a vast number of such errors. This document tries to identify and describe the trickiest and most common ones.

Vanishing parts

The symptom is that parts of a change you made have no effect and do not show up after editing or defining the domain: the define or edit command succeeds, but when you dump the XML again, the change has disappeared.

A classic mistake when writing XML for libvirt from scratch, instead of having it generated, is to use a broken construct or to make a syntax error such as a misplaced or misspelled tag or attribute name. libvirt generally looks only for the constructs it knows and ignores everything else, so some of your changes may vanish after libvirt parses your input.

The best way to guard against such problems is to validate the XML input before passing it to the edit or define commands, as described in the next section.

Validating XML against libvirt schemas

The libvirt developers maintain a set of XML schemas, bundled with libvirt, that define as many as possible of the constructs allowed in XML documents used by libvirt. You can validate libvirt XML files using the following command:

# virt-xml-validate libvirt.xml

If that passes, chances are that libvirt will understand all constructs in your XML, though that is not an absolute guarantee. For example, the schemas cannot detect options that are valid only for a given hypervisor. Even so, validation should help pinpoint problems.

Any XML generated by libvirt, for example as a result of a virsh dumpxml command, should validate without error; a failure to do so should be reported to the developers for fixing.

Incorrect drive device type

Add a new CD-ROM drive or modify an existing one with the following XML:

<disk type='block' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/path/to/image.iso'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk>

Symptom

The definition of the source image for the CD-ROM virtual drive is missing despite having been added:

# virsh dumpxml domain
<domain type='kvm'>
  ...
  <disk type='block' device='cdrom'>
    <driver name='qemu' type='raw'/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
  </disk>
  ...
</domain>

Solution

A disk device of type block expects the source to be a physical device. To use the disk with an image file, use type file instead.
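
A quick way to spot this mismatch before defining a domain is to scan the XML for a file source inside a block disk. The following is a rough, line-oriented sketch rather than a real XML parser: `find_block_file_sources` is a hypothetical helper, and it assumes one tag per line, as in the examples above.

```shell
#!/bin/sh
# Heuristic, line-oriented check: report <source file=...> lines that appear
# inside a <disk type='block'> element. find_block_file_sources is a
# hypothetical helper; it assumes one XML tag per line.
find_block_file_sources() {
    awk '
        /<disk[^>]*type=.block./        { in_block = 1 }
        in_block && /<source[^>]*file=/ { print "line " NR ": file source in a block disk"; found = 1 }
        /<\/disk>/                      { in_block = 0 }
        END                             { exit (found ? 0 : 1) }
    ' "$1"
}

# Example: the disk definition shown above, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<disk type='block' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/path/to/image.iso'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk>
EOF
find_block_file_sources "$tmp"
rm -f "$tmp"
```

The function exits with status 0 when it finds at least one suspicious line and 1 otherwise, so it can be used in an if statement ahead of virsh define.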


This appendix documents common libvirt-related problems and errors along with instructions for dealing with them.

Locate the error on the table below and follow the corresponding link under Solution for detailed troubleshooting information.

Table A.1. Common libvirt errors

Error | Description of problem | Solution
libvirtd failed to start | The libvirt daemon failed to start. However, there is no information about this error in /var/log/messages. | Section A.19.1, “libvirtd failed to start”
Cannot read CA certificate | This is one of several errors that occur when the URI fails to connect to the hypervisor. | Section A.19.2, “The URI Failed to Connect to the Hypervisor”
Other connectivity errors | These are other errors that occur when the URI fails to connect to the hypervisor. | Section A.19.2, “The URI Failed to Connect to the Hypervisor”
PXE boot (or DHCP) on guest failed | A guest virtual machine starts successfully, but is unable to acquire an IP address from DHCP, boot using the PXE protocol, or both. This is often a result of a long forward delay time set for the bridge, or when the iptables package and kernel do not support checksum mangling rules. | Section A.19.3, “PXE Boot (or DHCP) on Guest Failed”
Guest can reach outside network, but cannot reach host when using macvtap interface | A guest can communicate with other guests, but cannot connect to the host machine after being configured to use a macvtap (or type='direct') network interface. This is actually not an error; it is the defined behavior of macvtap. | Section A.19.4, “Guest Can Reach Outside Network, but Cannot Reach Host When Using macvtap interface”
Could not add rule to fixup DHCP response checksums on network 'default' | This warning message is almost always harmless, but is often mistakenly seen as evidence of a problem. | Section A.19.5, “Could not add rule to fixup DHCP response checksums on network ‘default’”
Unable to add bridge br0 port vnet0: No such device | This error message, or the similar Failed to add tap interface to bridge 'br0': No such device, reveals that the bridge device specified in the guest’s (or domain’s) <interface> definition does not exist. | Section A.19.6, “Unable to add bridge br0 port vnet0: No such device”
Unable to resolve address name_of_host service '49155': Name or service not known | QEMU guest migration fails and this error message appears with an unfamiliar host name. | Section A.19.7, “Migration Fails with error: unable to resolve address”
Unable to allow access for disk path /var/lib/libvirt/images/qemu.img: No such file or directory | A guest virtual machine cannot be migrated because libvirt cannot access the disk image(s). | Section A.19.8, “Migration Fails with Unable to allow access for disk path: No such file or directory”
No guest virtual machines are present when libvirtd is started | The libvirt daemon is successfully started, but no guest virtual machines appear to be present when running virsh list --all. | Section A.19.9, “No Guest Virtual Machines are Present when libvirtd is Started”
Common XML errors | libvirt uses XML documents to store structured data. Several common errors occur with XML documents when they are passed to libvirt through the API. This entry provides instructions for editing guest XML definitions, and details common errors in XML syntax and configuration. | Section A.19.10, “Common XML Errors”

A.19.1. libvirtd failed to start

Symptom

The libvirt daemon does not start automatically. Starting the libvirt daemon manually fails as well:

# systemctl start libvirtd.service
* Caching service dependencies ...                                                                                             [ ok ]
* Starting libvirtd ...
/usr/sbin/libvirtd: error: Unable to initialize network sockets. Check /var/log/messages or run without --daemon for more info.
* start-stop-daemon: failed to start `/usr/sbin/libvirtd'                                                                      [ !! ]
* ERROR: libvirtd failed to start

Moreover, there is no 'more info' about this error in /var/log/messages.

Investigation

Change libvirt’s logging in /etc/libvirt/libvirtd.conf by enabling the line below. To enable the setting, open the /etc/libvirt/libvirtd.conf file in a text editor, remove the hash (#) symbol from the beginning of the following line, and save the change:

log_outputs="3:syslog:libvirtd"

This line is commented out by default to prevent libvirt from producing excessive log messages. After diagnosing the problem, it is recommended to comment this line out again in the /etc/libvirt/libvirtd.conf file.

Restart libvirt to determine if this has solved the problem.

If libvirtd still does not start successfully, an error similar to the following will be printed:

# systemctl restart libvirtd
Job for libvirtd.service failed because the control process exited with error code. See "systemctl status libvirtd.service" and "journalctl -xe" for details.

Sep 19 16:06:02 jsrh libvirtd[30708]: 2017-09-19 14:06:02.097+0000: 30708: info : libvirt version: 3.7.0, package: 1.el7 (Unknown, 2017-09-06-09:01:55, js
Sep 19 16:06:02 jsrh libvirtd[30708]: 2017-09-19 14:06:02.097+0000: 30708: info : hostname: jsrh
Sep 19 16:06:02 jsrh libvirtd[30708]: 2017-09-19 14:06:02.097+0000: 30708: error : daemonSetupNetworking:502 : unsupported configuration: No server certif
Sep 19 16:06:02 jsrh systemd[1]: libvirtd.service: main process exited, code=exited, status=6/NOTCONFIGURED
Sep 19 16:06:02 jsrh systemd[1]: Failed to start Virtualization daemon.

-- Subject: Unit libvirtd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit libvirtd.service has failed.
--
-- The result is failed.

The libvirtd man page shows that the missing cacert.pem file is used as TLS authority when libvirt is run in Listen for TCP/IP connections mode. This means the --listen parameter is being passed.

Solution

Configure the libvirt daemon’s settings with one of the following methods:

  • Install a CA certificate.

  • Do not use TLS; use bare TCP instead. In /etc/libvirt/libvirtd.conf set listen_tls = 0 and listen_tcp = 1. The default values are listen_tls = 1 and listen_tcp = 0.

  • Do not pass the --listen parameter. In /etc/sysconfig/libvirtd change the LIBVIRTD_ARGS variable.

A.19.2. The URI Failed to Connect to the Hypervisor

Several different errors can occur when connecting to the server (for example, when running virsh).

A.19.2.1. Cannot read CA certificate

Symptom

When running a command, the following error (or similar) appears:

$ virsh -c qemu://$hostname/system list
error: failed to connect to the hypervisor
error: Cannot read CA certificate '/etc/pki/CA/cacert.pem': No such file or directory

Investigation

The error message is misleading about the actual cause. This error can be caused by a variety of factors, such as an incorrectly specified URI, or a connection that is not configured.

Solution
Incorrectly specified URI

When specifying qemu://system or qemu://session as a connection URI, virsh attempts to connect to a host named system or session, respectively. This is because virsh recognizes the text after the second forward slash as the host.

Use three forward slashes to connect to the local host. For example, specifying qemu:///system instructs virsh to connect to the system instance of libvirtd on the local host.

When a host name is specified, the QEMU transport defaults to TLS. TLS connections require certificates, which is why the error mentions the missing CA certificate.
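
The effect of two versus three slashes can be illustrated by extracting the host part of a URI the way the text describes it. This is only a sketch of the rule, not how virsh is implemented; `uri_host` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the host part of a libvirt connection URI, using only POSIX parameter
# expansion. uri_host is a hypothetical helper name, not a virsh feature.
uri_host() {
    rest=${1#*://}     # strip the scheme and the first two slashes
    host=${rest%%/*}   # keep everything before the next slash
    if [ -n "$host" ]; then
        echo "$host"
    else
        echo "(local host)"
    fi
}

for uri in qemu://system qemu:///system qemu+tls://server/system; do
    printf '%-26s -> %s\n' "$uri" "$(uri_host "$uri")"
done
```

With qemu://system the word system lands in the host position, while qemu:///system leaves the host empty, which is what selects the local connection.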

Connection is not configured

The URI is correct (for example, qemu[+tls]://server/system) but the certificates are not set up properly on your machine. For information on configuring TLS, see the upstream libvirt website.

A.19.2.2. unable to connect to server at ‘host:16509’: Connection refused

Symptom

While libvirtd should listen on TCP ports for connections, the connections fail:

# virsh -c qemu+tcp://host/system
error: failed to connect to the hypervisor
error: unable to connect to server at 'host:16509': Connection refused

The libvirt daemon is not listening on TCP ports even after changing configuration in /etc/libvirt/libvirtd.conf:

# grep listen_ /etc/libvirt/libvirtd.conf
listen_tls = 1
listen_tcp = 1
listen_addr = "0.0.0.0"

However, the TCP ports for libvirt are still not open after changing configuration:

# netstat -lntp | grep libvirtd
#

Investigation

The libvirt daemon was started without the --listen option. Verify this by running this command:

# ps aux | grep libvirtd
root     10749  0.1  0.2 558276 18280 ?        Ssl  23:21   0:00 /usr/sbin/libvirtd

The output does not contain the --listen option.

Solution

Start the daemon with the --listen option.

To do this, modify the /etc/sysconfig/libvirtd file and uncomment the following line:

# LIBVIRTD_ARGS="--listen"

Then, restart the libvirtd service with this command:

# /bin/systemctl restart libvirtd.service

A.19.2.3. Authentication Failed

Symptom

When running a command, the following error (or similar) appears:

$ virsh -c qemu://$hostname/system list
error: failed to connect to the hypervisor
error: authentication failed: authentication failed

Investigation

If authentication fails even when the correct credentials are used, it is possible that the SASL authentication is not configured.

Solution
  1. Edit the /etc/libvirt/libvirtd.conf file and set the value of the auth_tcp parameter to sasl. To verify:

    # cat /etc/libvirt/libvirtd.conf | grep auth_tcp
    auth_tcp = "sasl"
    
  2. Edit the /etc/sasl2/libvirt.conf file and add the following lines to the file:

    mech_list: digest-md5
    sasldb_path: /etc/libvirt/passwd.db
    
  3. Ensure the cyrus-sasl-md5 package is installed:

    # yum install cyrus-sasl-md5
  4. Restart the libvirtd service:

    # systemctl restart libvirtd
  5. Set a user name and password for libvirt SASL:

    # saslpasswd2 -a libvirt 1

A.19.2.4. Permission Denied

Symptom

When running a virsh command as a non-root user, the following error (or similar) appears:

$ virsh -c qemu://$hostname/system list
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Permission denied
error: failed to connect to the hypervisor

Solution
  1. Edit the /etc/libvirt/libvirtd.conf file and make sure the following lines are present and uncommented:

    unix_sock_group = "libvirt"
    unix_sock_ro_perms = "0777"
    unix_sock_rw_perms = "0770"
    
  2. Restart the libvirtd service:

    # systemctl restart libvirtd

A.19.3. PXE Boot (or DHCP) on Guest Failed

Symptom

A guest virtual machine starts successfully, but is then either unable to acquire an IP address from DHCP or boot using the PXE protocol, or both. There are two common causes of this error: having a long forward delay time set for the bridge, and when the iptables package and kernel do not support checksum mangling rules.

Long forward delay time on bridge
Investigation

This is the most common cause of this error. If the guest network interface is connecting to a bridge device that has STP (Spanning Tree Protocol) enabled, as well as a long forward delay set, the bridge will not forward network packets from the guest virtual machine onto the bridge until at least that number of forward delay seconds have elapsed since the guest connected to the bridge. This delay allows the bridge time to watch traffic from the interface and determine the MAC addresses behind it, and prevent forwarding loops in the network topology.

If the forward delay is longer than the timeout of the guest’s PXE or DHCP client, the client’s operation will fail, and the guest will either fail to boot (in the case of PXE) or fail to acquire an IP address (in the case of DHCP).

Solution

If this is the case, change the forward delay on the bridge to 0, disable STP on the bridge, or both.

This solution applies only if the bridge is not used to connect multiple networks, but just to connect multiple endpoints to a single network (the most common use case for bridges used by libvirt).

If the guest has interfaces connecting to a libvirt-managed virtual network, edit the definition for the network, and restart it. For example, edit the default network with the following command:

# virsh net-edit default

Add the following attributes to the <bridge> element:

<bridge name='virbr0' delay='0' stp='on'/>

delay='0' and stp='on' are the default settings for virtual networks, so this step is only necessary if the configuration has been modified from the default.

If the guest interface is connected to a host bridge that was configured outside of libvirt, change the delay setting.

Add or edit the following lines in the /etc/sysconfig/network-scripts/ifcfg-name_of_bridge file to turn STP on with a 0 second delay:

STP=on
DELAY=0

After changing the configuration file, restart the bridge device:

/usr/sbin/ifdown name_of_bridge
/usr/sbin/ifup name_of_bridge

If name_of_bridge is not the root bridge in the network, that bridge’s delay will be eventually reset to the delay time configured for the root bridge. To prevent this from occurring, disable STP on name_of_bridge.
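
To see what a bridge is currently using, the kernel exposes its STP state and forward delay under sysfs. The following is a minimal sketch, assuming the usual /sys/class/net/NAME/bridge layout and that forward_delay is reported in hundredths of a second; `show_bridge_delay` is a hypothetical helper, and its optional second argument exists only so the function can be pointed at a different directory tree.

```shell
#!/bin/sh
# Show STP state and forward delay for a bridge by reading sysfs.
# show_bridge_delay is a hypothetical helper; the optional second argument
# overrides the sysfs root so the function can be tried on a copy of the tree.
show_bridge_delay() {
    dir=${2:-/sys/class/net}/$1/bridge
    if [ ! -d "$dir" ]; then
        echo "$1: not a bridge on this host"
        return 1
    fi
    stp=$(cat "$dir/stp_state")
    delay=$(cat "$dir/forward_delay")   # assumed to be in hundredths of a second
    echo "$1: stp_state=$stp forward_delay=$((delay / 100))s"
}

show_bridge_delay virbr0 || true
```

A forward delay longer than the guest's DHCP or PXE client timeout is the situation described above.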

The iptables package and kernel do not support checksum mangling rules
Investigation

This message is only a problem if all four of the following conditions are true:

  • The guest is using virtio network devices.

    If so, the configuration file will contain model type='virtio'

  • The host has the vhost-net module loaded.

    This is true if ls /dev/vhost-net does not return an empty result.

  • The guest is attempting to get an IP address from a DHCP server that is running directly on the host.

  • The iptables version on the host is older than 1.4.10.

    iptables 1.4.10 was the first version to add the libxt_CHECKSUM extension. This is the case if the following message appears in the libvirtd logs:

    warning: Could not add rule to fixup DHCP response checksums on network default
    warning: May need to update iptables package and kernel to support CHECKSUM rule.

    Unless all of the other three conditions in this list are also true, the above warning message can be disregarded, and is not an indicator of any other problems.

When these conditions occur, UDP packets sent from the host to the guest have uncomputed checksums. This makes the host’s UDP packets seem invalid to the guest’s network stack.
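
Checking the fourth condition means comparing version numbers field by field, since a plain string comparison would rank 1.4.7 above 1.4.10. The following is a small sketch of such a comparison; `version_lt` is a hypothetical helper, and the version string is given literally here rather than read from the host.

```shell
#!/bin/sh
# Return success when dotted version $1 is older than $2, comparing the
# numeric fields left to right. version_lt is a hypothetical helper.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "[.]"); m = split(b, y, "[.]")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = x[i] + 0; yi = y[i] + 0
            if (xi < yi) exit 0
            if (xi > yi) exit 1
        }
        exit 1
    }'
}

# Example with a literal version string; on a real host the string would
# come from the output of the iptables --version command.
if version_lt 1.4.7 1.4.10; then
    echo "iptables 1.4.7 predates the libxt_CHECKSUM extension"
fi
```

Equal versions are treated as not older, so 1.4.10 itself passes the check.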

Solution

To solve this problem, invalidate any of the four points above. The best solution is to update the host iptables and kernel to iptables-1.4.10 or newer where possible. Otherwise, the most specific fix is to disable the vhost-net driver for this particular guest. To do this, edit the guest configuration with this command:

virsh edit name_of_guest

Change or add a <driver> line to the <interface> section:

<interface type='network'>
  <model type='virtio'/>
  <driver name='qemu'/>
  ...
</interface>

Save the changes, shut down the guest, and then restart it.

If this problem is still not resolved, the issue may be due to a conflict between firewalld and the default libvirt network.

To fix this, stop firewalld with the service firewalld stop command, then restart libvirt with the service libvirtd restart command.

In addition, if the /etc/sysconfig/network-scripts/ifcfg-network_name file is configured correctly, you can ensure that the guest acquires an IP address by using the dhclient command as root on the guest.

A.19.4. Guest Can Reach Outside Network, but Cannot Reach Host When Using macvtap interface

Symptom

A guest virtual machine can communicate with other guests, but cannot connect to the host machine after being configured to use a macvtap (also known as type='direct') network interface.

Investigation

Even when not connecting to a Virtual Ethernet Port Aggregator (VEPA) or VN-Link capable switch, macvtap interfaces can be useful. Setting the mode of such an interface to bridge allows the guest to be directly connected to the physical network in a very simple manner without the setup issues (or NetworkManager incompatibility) that can accompany the use of a traditional host bridge device.

However, when a guest virtual machine is configured to use a type='direct' network interface such as macvtap, despite having the ability to communicate with other guests and other external hosts on the network, the guest cannot communicate with its own host.

This situation is actually not an error — it is the defined behavior of macvtap. Due to the way in which the host’s physical Ethernet is attached to the macvtap bridge, traffic into that bridge from the guests that is forwarded to the physical interface cannot be bounced back up to the host’s IP stack. Additionally, traffic from the host’s IP stack that is sent to the physical interface cannot be bounced back up to the macvtap bridge for forwarding to the guests.

Solution

Use libvirt to create an isolated network, and create a second interface for each guest virtual machine that is connected to this network. The host and guests can then directly communicate over this isolated network, while also maintaining compatibility with NetworkManager.

Procedure A.8. Creating an isolated network with libvirt

  1. Add and save the following XML in the /tmp/isolated.xml file. If the 192.168.254.0/24 network is already in use elsewhere on your network, you can choose a different network.

    ...
    <network>
      <name>isolated</name>
      <ip address='192.168.254.1' netmask='255.255.255.0'>
        <dhcp>
          <range start='192.168.254.2' end='192.168.254.254'/>
        </dhcp>
      </ip>
    </network>
    ...
    

    Figure A.3. Isolated Network XML

  2. Create the network with this command: virsh net-define /tmp/isolated.xml

  3. Set the network to autostart with the virsh net-autostart isolated command.

  4. Start the network with the virsh net-start isolated command.

  5. Using virsh edit name_of_guest, edit the configuration of each guest that uses macvtap for its network connection and add a new <interface> in the <devices> section similar to the following (note the <model type='virtio'/> line is optional to include):

    ...
    <interface type='network' trustGuestRxFilters='yes'>
      <source network='isolated'/>
      <model type='virtio'/>
    </interface>
    

    Figure A.4. Interface Device XML

  6. Shut down, then restart each of these guests.

The guests are now able to reach the host at the address 192.168.254.1, and the host will be able to reach the guests at the IP address they acquired from DHCP (alternatively, you can manually configure the IP addresses for the guests). Since this new network is isolated to only the host and guests, all other communication from the guests will use the macvtap interface. For more information, see Section 23.17.8, “Network Interfaces”.

A.19.5. Could not add rule to fixup DHCP response checksums on network ‘default’

Symptom

This message appears:

Could not add rule to fixup DHCP response checksums on network 'default'

Investigation

Although this message appears to be evidence of an error, it is almost always harmless.

Solution

Unless the problem you are experiencing is that the guest virtual machines are unable to acquire IP addresses through DHCP, this message can be ignored.

A.19.6. Unable to add bridge br0 port vnet0: No such device

Symptom

The following error message appears:

Unable to add bridge name_of_bridge port vnet0: No such device

For example, if the bridge name is br0, the error message appears as:

Unable to add bridge br0 port vnet0: No such device

In libvirt versions 0.9.6 and earlier, the same error appears as:

Failed to add tap interface to bridge name_of_bridge: No such device

Or for example, if the bridge is named br0:

Failed to add tap interface to bridge 'br0': No such device

Investigation

Both error messages reveal that the bridge device specified in the guest’s (or domain’s) <interface> definition does not exist.

To verify the bridge device listed in the error message does not exist, use the shell command ifconfig br0.

A message similar to this confirms the host has no bridge by that name:

br0: error fetching interface information: Device not found

If this is the case, continue to the solution.

However, if the resulting message is similar to the following, the issue exists elsewhere:

br0        Link encap:Ethernet  HWaddr 00:00:5A:11:70:48
           inet addr:10.22.1.5  Bcast:10.255.255.255  Mask:255.0.0.0
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:249841 errors:0 dropped:0 overruns:0 frame:0
           TX packets:281948 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
	   RX bytes:106327234 (101.4 MiB)  TX bytes:21182634 (20.2 MiB)
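
The same check can be scripted. The following is a minimal sketch, assuming a Linux host where sysfs lists network devices under /sys/class/net and bridge devices expose a bridge subdirectory; the function name bridge_status is hypothetical.

```python
import os


def bridge_status(name, sysfs_root="/sys/class/net"):
    """Classify a device name as 'missing', 'bridge', or 'not a bridge'.

    Assumes the Linux sysfs layout; sysfs_root is a parameter only so the
    function can be exercised against a test directory.
    """
    device_dir = os.path.join(sysfs_root, name)
    if not os.path.isdir(device_dir):
        return "missing"
    if os.path.isdir(os.path.join(device_dir, "bridge")):
        return "bridge"
    return "not a bridge"


if __name__ == "__main__":
    print("br0:", bridge_status("br0"))
```

A result of "missing" corresponds to the Device not found case above; "not a bridge" means an interface with that name exists but cannot be used as the <source bridge=.../> of a guest interface.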
Solution

Edit the existing bridge or create a new bridge with virsh

Use virsh to either edit the settings of an existing bridge or network, or to add the bridge device to the host system configuration.

Edit the existing bridge settings using virsh

Use virsh edit name_of_guest to change the <interface> definition to use a bridge or network that already exists.

For example, change type='bridge' to type='network', and <source bridge='br0'/> to <source network='default'/>.

Create a host bridge using virsh

For libvirt version 0.9.8 and later, a bridge device can be created with the virsh iface-bridge command. The following command creates a bridge device br0 and attaches eth0, the physical network interface, to it:

virsh iface-bridge eth0 br0

Optional: If needed, remove this bridge and restore the original eth0 configuration with this command:

virsh iface-unbridge br0

Create a host bridge manually

A.19.7. Migration Fails with error: unable to resolve address

Symptom

QEMU guest migration fails and this error message appears:

# virsh migrate qemu qemu+tcp://192.168.122.12/system
  error: Unable to resolve address name_of_host service '49155': Name or service not known

For example, if the destination host name is newyork, the error message appears as:

# virsh migrate qemu qemu+tcp://192.168.122.12/system
error: Unable to resolve address 'newyork' service '49155': Name or service not known

However, this error looks strange because the newyork host name was not used anywhere in the command.

Investigation

During migration, libvirtd running on the destination host creates a URI from an address and port where it expects to receive migration data and sends it back to libvirtd running on the source host.

In this case, the destination host (192.168.122.12) has its name set to ‘newyork’. For some reason, libvirtd running on that host is unable to resolve the name to an IP address that could be sent back and still be useful. For this reason, it returned the ‘newyork’ host name hoping the source libvirtd would be more successful with resolving the name. This can happen if DNS is not properly configured or /etc/hosts has the host name associated with local loopback address (127.0.0.1).

Note that the address used for migration data cannot be automatically determined from the address used for connecting to destination libvirtd (for example, from qemu+tcp://192.168.122.12/system). This is because to communicate with the destination libvirtd, the source libvirtd may need to use network infrastructure different from the type that virsh (possibly running on a separate machine) requires.

Solution

The best solution is to configure DNS correctly so that all hosts involved in migration are able to resolve all host names.

If DNS cannot be configured to do this, a list of every host used for migration can be added manually to the /etc/hosts file on each of the hosts. However, it is difficult to keep such lists consistent in a dynamic environment.
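
To spot the /etc/hosts problem described in the investigation, the file can be scanned for host names that are mapped to a loopback address. This is a minimal sketch using only the standard library; the function name names_on_loopback is hypothetical, and it inspects /etc/hosts only, not DNS.

```python
import ipaddress


def names_on_loopback(hosts_text):
    """Return the set of names that an /etc/hosts-style text maps to loopback."""
    found = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not a plain IP address; skip the entry
        if address.is_loopback:
            found.update(fields[1:])
    return found


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        print(sorted(names_on_loopback(handle.read())))
```

If the real host name of a migration destination (newyork in the example) appears in the output, the name resolves to 127.0.0.1 or ::1 locally and is not useful to the source host.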

If the host names cannot be made resolvable by any means, virsh migrate supports specifying the migration host:

# virsh migrate qemu qemu+tcp://192.168.122.12/system tcp://192.168.122.12

Destination libvirtd will take the tcp://192.168.122.12 URI and append an automatically generated port number. If this is not desirable (because of firewall configuration, for example), the port number can be specified in this command:

# virsh migrate qemu qemu+tcp://192.168.122.12/system tcp://192.168.122.12:12345

Another option is to use tunneled migration. Tunneled migration does not create a separate connection for migration data, but instead tunnels the data through the connection used for communication with destination libvirtd (for example, qemu+tcp://192.168.122.12/system):

# virsh migrate qemu qemu+tcp://192.168.122.12/system --p2p --tunnelled

A.19.8. Migration Fails with Unable to allow access for disk path: No such file or directory

Symptom

A guest virtual machine (or domain) cannot be migrated because libvirt cannot access the disk image(s):

# virsh migrate qemu qemu+tcp://name_of_host/system
error: Unable to allow access for disk path /var/lib/libvirt/images/qemu.img: No such file or directory

For example, if the destination host name is newyork, the error message appears as:

# virsh migrate qemu qemu+tcp://newyork/system
error: Unable to allow access for disk path /var/lib/libvirt/images/qemu.img: No such file or directory

Investigation

By default, migration only transfers the in-memory state of a running guest (such as memory or CPU state). Although disk images are not transferred during migration, they need to remain accessible at the same path by both hosts.

Solution

Set up and mount shared storage at the same location on both hosts. The simplest way to do this is to use NFS:

Procedure A.9. Setting up shared storage

  1. Set up an NFS server on a host serving as shared storage. The NFS server can be one of the hosts involved in the migration, as long as all hosts involved are accessing the shared storage through NFS.

    # mkdir -p /exports/images
    # cat >>/etc/exports <<EOF
    /exports/images    192.168.122.0/24(rw,no_root_squash)
    EOF

  2. Mount the exported directory at a common location on all hosts running libvirt. For example, if the IP address of the NFS server is 192.168.122.1, mount the directory with the following commands:

    # cat >>/etc/fstab <<EOF
    192.168.122.1:/exports/images  /var/lib/libvirt/images  nfs  auto  0 0
    EOF
    # mount /var/lib/libvirt/images

It is not possible to export a local directory from one host using NFS and mount it at the same path on another host — the directory used for storing disk images must be mounted from shared storage on both hosts. If this is not configured correctly, the guest virtual machine may lose access to its disk images during migration, because the source host’s libvirt daemon may change the owner, permissions, and SELinux labels on the disk images after it successfully migrates the guest to its destination.

If libvirt detects that the disk images are mounted from a shared storage location, it will not make these changes.
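
Before migrating, it can help to confirm on each host that the image directory really is an NFS mount. The sketch below is not how libvirt performs its own detection; it is a manual check that looks up the file system type of a path in /proc/mounts-style text. The function name fs_type_for is hypothetical, and mount points containing escaped spaces are not handled.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the file system type of the mount that contains `path`.

    mounts_text uses the /proc/mounts format: device, mount point, type, ...
    The longest matching mount point wins; later entries override earlier ones.
    """
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = os.path.normpath(fields[1]), fields[2]
        inside = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        if inside and len(mount_point) >= best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        print(fs_type_for("/var/lib/libvirt/images", handle.read()))
```

On a correctly configured host this is expected to report an NFS type (for example nfs or nfs4) for /var/lib/libvirt/images on both the source and the destination.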

A.19.9. No Guest Virtual Machines are Present when libvirtd is Started

Symptom

The libvirt daemon is successfully started, but no guest virtual machines appear to be present.

# virsh list --all
 Id    Name                           State
----------------------------------------------------

Investigation

There are various possible causes of this problem. Performing these tests will help to determine the cause of this situation:

Verify KVM kernel modules

Verify that KVM kernel modules are inserted in the kernel:

# lsmod | grep kvm
kvm_intel             121346  0
kvm                   328927  1 kvm_intel

If you are using an AMD machine, verify the kvm_amd kernel modules are inserted in the kernel instead, using the similar command lsmod | grep kvm_amd in the root shell.

If the modules are not present, insert them using the modprobe <modulename> command.

Although it is uncommon, KVM virtualization support may be compiled into the kernel. In this case, modules are not needed.
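
The lsmod check can also be done programmatically by reading /proc/modules, which is the file lsmod itself formats. A minimal sketch follows; the function name loaded_kvm_modules is hypothetical. As noted above, an empty result is not conclusive if KVM is compiled into the kernel.

```python
def loaded_kvm_modules(modules_text):
    """Return the sorted names of loaded kvm modules from /proc/modules text."""
    names = {line.split()[0] for line in modules_text.splitlines() if line.strip()}
    return sorted(n for n in names if n == "kvm" or n.startswith("kvm_"))


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        print(loaded_kvm_modules(handle.read()))
```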

Verify virtualization extensions

Verify that virtualization extensions are supported and enabled on the host:

# egrep "(vmx|svm)" /proc/cpuinfo
flags		: fpu vme de pse tsc ... svm ... skinit wdt npt lbrv svm_lock nrip_save
flags		: fpu vme de pse tsc ... svm ... skinit wdt npt lbrv svm_lock nrip_save

Enable virtualization extensions in your hardware’s firmware configuration within the BIOS setup. See your hardware documentation for further details on this.
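
The egrep check above can be expressed as a small function that reports which of the two flags (vmx for Intel, svm for AMD) the CPUs advertise. This is a sketch under the assumption that /proc/cpuinfo uses the usual "flags : ..." lines; the function name virt_extensions is hypothetical.

```python
def virt_extensions(cpuinfo_text):
    """Return the set of virtualization flags ('vmx', 'svm') found in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Match whole flag names only, so svm_lock does not count as svm.
            found.update(flag for flag in value.split() if flag in ("vmx", "svm"))
    return found


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        print(virt_extensions(handle.read()) or "no vmx/svm flag found")
```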

Verify client URI configuration

Verify that the URI of the client is configured as intended:

# virsh uri
vbox:///system

For example, this message shows the URI is connected to the VirtualBox hypervisor, not QEMU, and reveals a configuration error for a URI that is otherwise set to connect to a QEMU hypervisor. If the URI was correctly connecting to QEMU, the same message would appear instead as:

# virsh uri
qemu:///system

This situation occurs when there are other hypervisors present, which libvirt may speak to by default.
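
When scripting this check, the hypervisor driver can be read from the scheme of the URI that virsh uri prints; a transport suffix such as +tcp or +unix is not part of the driver name. The helper below is a hypothetical sketch built on the standard URL parser.

```python
from urllib.parse import urlsplit


def hypervisor_driver(uri):
    """Return the driver part of a libvirt URI scheme, e.g. 'qemu' for qemu+tcp://..."""
    scheme = urlsplit(uri).scheme
    return scheme.split("+", 1)[0]


if __name__ == "__main__":
    for uri in ("vbox:///system", "qemu:///system", "qemu+tcp://192.168.122.12/system"):
        print(uri, "->", hypervisor_driver(uri))
```

If the result is not qemu on a host that is meant to run QEMU/KVM guests, the client is talking to a different hypervisor driver and virsh list --all will not show the expected guests.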

Solution

After performing these tests, use the following command to view a list of guest virtual machines:

# virsh list --all

A.19.10. Common XML Errors

The libvirt tool uses XML documents to store structured data. A variety of common errors occur with XML documents when they are passed to libvirt through the API. Several common XML errors — including erroneous XML tags, inappropriate values, and missing elements — are detailed below.

A.19.10.1. Editing domain definition

Although it is not recommended, it is sometimes necessary to edit a guest virtual machine’s (or a domain’s) XML file manually. To access the guest’s XML for editing, use the following command:

# virsh edit name_of_guest

This command opens the current definition of the guest virtual machine in a text editor. After finishing the edits and saving the changes, the XML is reloaded and parsed by libvirt. If the XML is correct, the following message is displayed:

# virsh edit name_of_guest

Domain name_of_guest XML configuration edited.

When using the edit command in virsh to edit an XML document, save all changes before exiting the editor.

After saving the XML file, use the xmllint command to validate that the XML is well-formed, or the virt-xml-validate command to check for usage problems:

# xmllint --noout config.xml
# virt-xml-validate config.xml

If no errors are returned, the XML description is well-formed and matches the libvirt schema. While the schema does not catch all constraints, fixing any reported errors will further troubleshooting.
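
A well-formedness check comparable to xmllint --noout can also be done from a script. The sketch below uses Python's standard XML parser, not libxml2, so the wording of the messages differs from the examples in this section, and it does not validate against the libvirt schema. The function name well_formedness_error is hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        problem = well_formedness_error(handle.read())
    print("well-formed" if problem is None else "error at line %d, column %d: %s" % problem)
```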

XML documents stored by libvirt

These documents contain definitions of states and configurations for the guests. These documents are automatically generated and should not be edited manually. Errors in these documents contain the file name of the broken document. The file name is valid only on the host machine defined by the URI, which may refer to the machine the command was run on.

Errors in files created by libvirt are rare. However, one possible source of these errors is a downgrade of libvirt — while newer versions of libvirt can always read XML generated by older versions, older versions of libvirt may be confused by XML elements added in a newer version.

A.19.10.2. XML syntax errors

Syntax errors are caught by the XML parser. The error message contains information for identifying the problem.

This example error message from the XML parser consists of three lines — the first line denotes the error message, and the two following lines contain the context and location of the XML code containing the error. The third line contains an indicator showing approximately where the error lies on the line above it:

error: (name_of_guest.xml):6: StartTag: invalid element name
<vcpu>2</vcpu><
-----------------^

Information contained in this message:

(name_of_guest.xml)

This is the file name of the document that contains the error. File names in parentheses are symbolic names to describe XML documents parsed from memory, and do not directly correspond to files on disk. File names that are not contained in parentheses are local files that reside on the target of the connection.

6

This is the line number in the XML file that contains the error.

StartTag: invalid element name

This is the error message from the libxml2 parser, which describes the specific XML error.

A.19.10.2.1. Stray < in the document

Symptom

The following error occurs:

error: (name_of_guest.xml):6: StartTag: invalid element name
<vcpu>2</vcpu><
-----------------^

Investigation

This error message shows that the parser expects a new element name after the < symbol on line 6 of a guest’s XML file.

Ensure line number display is enabled in your text editor. Open the XML file, and locate the text on line 6. This snippet of a guest’s XML file contains an extra < in the document:

<domain type='kvm'>
   <name>name_of_guest</name>
<memory>524288</memory>
<vcpu>2</vcpu><

Solution

Remove the extra < or finish the new element.

A.19.10.2.2. Unterminated attribute

Symptom

The following error occurs:

error: (name_of_guest.xml):2: Unescaped '<' not allowed in attributes values
<name>name_of_guest</name>
--^

Investigation

This snippet of a guest’s XML file contains an unterminated element attribute value:

<domain type='kvm>
<name>name_of_guest</name>

In this case, 'kvm' is missing a second quotation mark. Attribute values must be opened and closed with quotation marks or apostrophes, similar to XML start and end tags.

Solution

Correctly open and close all attribute value strings.

A.19.10.2.3. Opening and ending tag mismatch

Symptom

The following error occurs:

error: (name_of_guest.xml):61: Opening and ending tag mismatch: clock line 16 and domain
</domain>
---------^

Investigation

The error message above contains three clues to identify the offending tag:

The message following the last colon, clock line 16 and domain, reveals that <clock> contains a mismatched tag on line 16 of the document. The last hint is the pointer in the context part of the message, which identifies the second offending tag.

Unpaired tags must be closed with />. The following snippet does not follow this rule and has produced the error message shown above:

<domain type='kvm'>
  ...
    <clock offset='utc'>

This error is caused by mismatched XML tags in the file. Every XML tag must have a matching start and end tag.

Other examples of mismatched XML tags

The following examples produce similar error messages and show variations of mismatched XML tags.

This snippet contains a mismatch error for <features> because there is no end tag (</features>):

<domain type='kvm'>
 ...
 <features>
   <acpi/>
   <pae/>
 ...
 </domain>

This snippet contains an end tag (</name>) without a corresponding start tag:

<domain type='kvm'>
  </name>
  ...
</domain>

Solution

Ensure all XML tags start and end correctly.

A.19.10.3. Logic and configuration errors

A well-formatted XML document can contain errors that are correct in syntax but libvirt cannot parse. Many of these errors exist, with two of the most common cases outlined below.

A.19.10.3.1. Vanishing parts

Symptom

Parts of the change you have made do not show up and have no effect after editing or defining the domain. The define or edit command works, but when dumping the XML once again, the change disappears.

Investigation

This error likely results from a broken construct or syntax that libvirt does not parse. The libvirt tool will generally only look for constructs it knows but ignore everything else, resulting in some of the XML changes vanishing after libvirt parses the input.

Solution

Validate the XML input before passing it to the edit or define commands. The libvirt developers maintain a set of XML schemas bundled with libvirt that define the majority of the constructs allowed in XML documents used by libvirt.

Validate libvirt XML files using the following command:

# virt-xml-validate libvirt.xml

If this command passes, libvirt will likely understand all constructs from your XML, except if the schemas cannot detect options that are valid only for a given hypervisor. For example, any XML generated by libvirt as a result of a virsh dumpxml command should validate without error.
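
To see exactly which parts vanished, the XML that was submitted can be compared with the XML that libvirt returns afterwards. The sketch below is a hypothetical helper, not a libvirt tool: it lists element and attribute paths that are present in the submitted document but absent from the dumped one. It ignores values and text, and libvirt may also add elements of its own, which this comparison does not report.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Collect element paths (and path@attribute entries) from an XML document."""
    paths = set()

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        paths.add(path)
        for attr_name in elem.attrib:
            paths.add(f"{path}@{attr_name}")
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def vanished(submitted_xml, dumped_xml):
    """Return paths present in submitted_xml but missing from dumped_xml."""
    return sorted(element_paths(submitted_xml) - element_paths(dumped_xml))


if __name__ == "__main__":
    submitted = "<domain type='kvm'><name>guest</name><memmory>524288</memmory></domain>"
    dumped = "<domain type='kvm'><name>guest</name></domain>"
    print(vanished(submitted, dumped))
```

In this made-up example the misspelled memmory element is silently dropped, so it is the only path reported.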

A.19.10.3.2. Incorrect drive device type

Symptom

The definition of the source image for the CD-ROM virtual drive is not present, despite being added:

# virsh dumpxml domain
<domain type='kvm'>
  ...
  <disk type='block' device='cdrom'>
    <driver name='qemu' type='raw'/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
  </disk>
  ...
</domain>

Solution

Correct the XML by adding the missing <source> element and, because the source is an image file, setting the disk type to 'file':

<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/path/to/image.iso'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk>

A type='block' disk device expects that the source is a physical device. To use the disk with an image file, use type='file' instead.
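
This kind of mismatch can be caught before the XML is handed to libvirt. The following sketch checks that each disk's <source> carries the attribute that matches its type: file for type='file' and dev for type='block'. The function name disk_source_problems is hypothetical, only these two disk types are checked, and disks without a <source> element (for example an empty CD-ROM drive) are skipped.

```python
import xml.etree.ElementTree as ET

# Attribute that <source> is expected to carry for each disk type checked here.
EXPECTED_SOURCE_ATTR = {"file": "file", "block": "dev"}


def disk_source_problems(domain_xml):
    """Return descriptions of disks whose <source> attribute does not match the disk type."""
    problems = []
    for disk in ET.fromstring(domain_xml).iter("disk"):
        disk_type = disk.get("type")
        wanted = EXPECTED_SOURCE_ATTR.get(disk_type)
        source = disk.find("source")
        if wanted is None or source is None:
            continue  # unchecked disk type, or no media defined
        if source.get(wanted) is None:
            target = disk.find("target")
            dev = target.get("dev") if target is not None else "?"
            problems.append(f"{dev}: type='{disk_type}' expects <source {wanted}='...'/>")
    return problems


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for problem in disk_source_problems(handle.read()):
            print(problem)
```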

  • #1

Hi,

I have searched the forum but not found any similar issues.

My system:

  • MB: SM X11SPi-TF
  • CPU: Intel Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
  • RAM: 196 GB
  • Storage: 6xWD WD Ultrastar 550 18 TB SAS3, 2xIntel Optane P4801x 100GB, 2xSamsung 970 EVO 2 TB
  • HDC: LSI 9340-8i 12G
  • NIC: 2x Intel x722 10 GB, 1xMellanox ConnectX-3 Pro

Software:

  • TrueNAS-SCALE-22.02-RC.2

I created a couple of Ubuntu VMs through the GUI. They started without any issues, so I checked «Autostart» and rebooted my server, and now they won’t start. If I try to start them manually, I get an error:

Code:

Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 160, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self,
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1281, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1269, in nf
    return await func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1137, in nf
    res = await f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_lifecycle.py", line 42, in start
    await self.middleware.run_in_thread(self._start, vm['name'])
  File "/usr/lib/python3/dist-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_supervisor.py", line 62, in _start
    self.vms[vm_name].start(vm_data=self._vm_from_name(vm_name))
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/supervisor/supervisor_base.py", line 122, in start
    if self.domain.isActive():
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1709, in isActive
    raise libvirtError('virDomainIsActive() failed')
libvirt.libvirtError: internal error: client socket is closed

Looking through the log journalctl | grep libvirt I found one instance of

Code:

libvirt: XML-RPC error : Cannot write data: Broken pipe

followed by a large amount of:

Code:

libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed
libvirt: XML-RPC error : internal error: client socket is closed

I checked the source for middlewared on GitHub and extracted the URL used for virsh: «qemu+unix:///system?socket=/run/truenas_libvirt/libvirt-sock» so I logged in using SSH and tried virsh -c "qemu+unix:///system?socket=/run/truenas_libvirt/libvirt-sock" sysinfo and it worked:

Code:

<sysinfo type='smbios'>
  <bios>
    <entry name='vendor'>American Megatrends Inc.</entry>
    <entry name='version'>3.5</entry>
    <entry name='date'>05/18/2021</entry>
    <entry name='release'>5.14</entry>
  </bios>
  <system>
    <entry name='manufacturer'>SuperMicro</entry>
    <entry name='product'>Super Server</entry>
    <entry name='version'>0123456789</entry>
    <entry name='serial'>0123456789</entry>
    <entry name='uuid'>00000000-0000-0000-0000-3cecef075479</entry>
    <entry name='sku'>To be filled by O.E.M.</entry>
    <entry name='family'>To be filled by O.E.M.</entry>
  </system>
  ....

so the socket works fine if called manually, but Middlewared gets an error.

I haven’t done any configuration through the CLI; everything was done through the GUI.

Has anyone seen anything similar?
Have I missed something?

//Many thanks

  • #2

I am seeing more or less the same error when I manually start a VM. This started shortly after a failed attempt to install Linux Mint and/or Debian onto a VM. I have since forgotten the exact sequence, but I remember that one of the two VMs stopped responding, so I force-killed it, and since then VMs will not start. I have not tried rebooting TrueNAS yet.

  • #3

I am also seeing this (albeit on TrueNAS CORE). The first time it happened was last night when I was working my way through the solution recommended to me in my most recent post: first I had shut down my VM, then I issued # ifconfig bridge0 destroy on the host, then I manually created a bridge in the WebGUI, then I tried to start up the VM, then I saw the error, then I rebooted, recreated the bridge, and started the VM back up and the error did not present itself again.

Just now, one of my VMs locked up so I powered it off and tried to restart it and saw the error once again. As before, I rebooted with the VM autostarted and the error does not resurface.

  • #4

I’m having a similar issue with my VM; it started within the past 24 hours.

I recently upgraded to U8 (yesterday, 2/19) and now have a VM stability issue (it was rock solid under U7). The VM now crashes and becomes unresponsive after a few hours. When I attempt to restart it via the UI, it hangs and then reports «libvirt.libvirtError: internal error: client socket is closed». I have to reboot TrueNAS itself to resolve it, but during the reboot the shutdown process hangs with «some process could not die: ps axl advised» and needs a power cycle.

  • #5

A quick update on my issue. Everything started working again after I updated to 22.02. My guess is that NAS-114087 fixed the issue.

  • #6

nop — I’m on
Version:
TrueNAS-SCALE-22.02.0.1

Would be nice to get an idea what is going on in code line 121…domain? I hope my domain is active :smile:), the developer not..

  line 121, in start
    if self.domain.isActive():
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1709, in isActive
    raise libvirtError('virDomainIsActive() failed')
libvirt.libvirtError: internal error: client socket is closed

  • #7

maybe because TN is Resilvering — who knows, not me
..

  • #8

At least here, stopping the resilvering with a good old reboot solved the problem; the VM is up and running again.
So I assume that when the box is too busy it can’t get the connection to its own bridge… or so.

  • #9

Hey, is anybody aware of this topic?
The issue still exists on the latest version.
What kind of logs are needed?

Last edited: Apr 16, 2022

  • #10

I’m having the same issue on TrueNAS-12.0-U8.1

  • #11

Version: TrueNAS-SCALE-22.02.1

I’m seeing this in my logs:
May 27 13:18:43 truenas middlewared[917]: libvirt: XML-RPC error : internal error: client socket is closed
May 27 13:18:43 truenas middlewared[917]: libvirt: XML-RPC error : internal error: client socket is closed

I have a VM that’s running, but the VM state is «ERROR». I’m not sure what the problem is.

libvirtError

internal error: client socket is closed
Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 175, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1257, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1261, in nf
    return await func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1129, in nf
    res = await f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_lifecycle.py", line 39, in start
    await self.middleware.run_in_thread(self._start, vm['name'])
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1172, in run_in_thread
    return await self.run_in_executor(self.thread_pool_executor, method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1169, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_supervisor.py", line 62, in _start
    self.vms[vm_name].start(vm_data=self._vm_from_name(vm_name))
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/supervisor/supervisor_base.py", line 121, in start
    if self.domain.isActive():
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1709, in isActive
    raise libvirtError('virDomainIsActive() failed')
libvirt.libvirtError: internal error: client socket is closed

  • #12

Rebooting seems to have «fixed» the problem, for the last day or so no repeated issue. But the fact that it comes up is worrisome.

  • #13

I have had a similar, if not the exact same, issue crop up twice now. It seems to occur at ~6 day intervals on my system: once today (why I’m here), and once on the 27th after close to 6 days of uptime. I do not have issues logging into the VMs themselves, just interacting with them in the webUI or running virsh list --all. Restarting libvirtd doesn’t seem to do the trick. I rebooted the whole machine after safely shutting down all of the VMs, and all of the VMs set to autostart booted up just fine and I could interact with the VM webUI again.

If it happens again (I expect it to, but nbd) I’ll try to get as much info as I can. Any info that I can grab that would help fix?

  • #14

I’m having a similar issue on TrueNAS-SCALE-22.02.1 that we just noticed last night.

One of our Windows Server 2019 VMs started exhibiting weird behavior.

Task Manager doesn’t start, returns error — The service cannot accept control messages at this time.
Services Manager doesn’t start, returns error — The service cannot accept control messages at this time.
Event Viewer doesn’t start, returns error — The service cannot accept control messages at this time.
Elevated CLI doesn’t start, returns error — The service cannot accept control messages at this time.
Shutdown/Reboot commands from the GUI do nothing, no error returned.

Checked TrueNAS GUI and the VMs (multiple) are OFF/ERROR state, despite both VMs being up, accessible, and carrying on their respective functions. I just can’t get into any of the Administrative functions.

Attempting to start them returns the libvirtError.

Looks like the only solution I have is to gracefully shutdown and reboot TrueNAS SCALE.

  • #15

I have the exact same happening to me. VM is up and running normally despite saying it is off under state, but I cannot access it at all.
Same libvirtError as above if I try to start any of my other VM’s or the running one.

  • #16

Same issue. I have 3 TrueNAS SCALE servers in total. 2 of them are used for VMs. Both of them give the same errors. VMs can run fine for a couple of days, then suddenly they’re turned off and won’t start. Reboot of server «fixes» the problem, which is… «Less than desirable»… Windows, FreeBSD and Linux VMs are all affected.

  • #17

To clarify; the VMs are fine, up and running. It’s the TrueNAS SCALE UI that seems to have problems.

  • #18

Just had this happen again. An anecdotal observation: I got further uptime-wise (~18 days before erroring out) without any of my Windows VMs live. All VMs are still up and functional; I just cannot manage them through the webUI or virsh:

Code:

root@truenas[/]# virsh list
error: failed to connect to the hypervisor
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

  • #19

I am running into this issue too, with the UI disconnecting from libvirt. Running "systemctl restart middlewared" seems to fix it without having to reboot. I posted an issue to the Jira tracker.

  • #20

Workaround in place, in the form of a daily middlewared restart then..
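
The daily restart could be narrowed so that middlewared is restarted only when libvirt is actually unreachable. A minimal Python sketch, assuming that `virsh list` exiting non-zero (as in the output earlier in this thread) is a good enough signal and that restarting middlewared is the workaround reported above. The function names and the injectable `run` parameter are this example's own and are not part of TrueNAS:

```python
import subprocess


def libvirt_reachable(run=subprocess.run):
    """Return True if `virsh list` exits with status 0."""
    result = run(["virsh", "list"], capture_output=True, text=True)
    return result.returncode == 0


def restart_middlewared_if_unreachable(run=subprocess.run):
    """Restart middlewared only when virsh cannot reach libvirt.

    Returns True if a restart was issued, False otherwise.
    """
    if libvirt_reachable(run):
        return False
    # Workaround reported in this thread; needs root privileges.
    run(["systemctl", "restart", "middlewared"], check=True)
    return True
```

Calling `restart_middlewared_if_unreachable()` from a root cron job or systemd timer would replace the unconditional daily restart. The `run` parameter exists only so the logic can be exercised without touching a real system.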

I am seeing a connection error when I try to convert a Xen VM to KVM with virt-v2v.

I can use virt-manager without problems.

Errors seen:

[root@localhost ~]# virt-v2v -v -x -ic "xen+ssh://10.61.0.47" "LAC-Venu" -o local -os /vm-images/
virt-v2v: libguestfs 1.28.1 (x86_64)
[   0.0] Opening the source -i libvirt -ic xen+ssh://10.61.0.47 LAC-Venu
input_libvirt_xen_ssh: source: scheme xen+ssh server 10.61.0.47
libvirt: XML-RPC error : End of file while reading data: sh: nc: command not found: Input/output error
virt-v2v: error: internal error: invalid argument: cannot open libvirt 
connection 'xen+ssh://10.61.0.47'

If reporting bugs, run virt-v2v with debugging enabled and include the
complete output:

virt-v2v -v -x [...]
[root@localhost ~]# 

The installed packages are listed below:
[root@localhost ~]# rpm -qa | grep virt*
libvirt-daemon-config-network-1.2.17-13.el7_2.5.x86_64
libgovirt-0.3.3-1.el7_2.1.x86_64
libvirt-gconfig-0.1.9-1.el7.x86_64
libvirt-daemon-driver-interface-1.2.17-13.el7_2.5.x86_64
virt-v2v-1.28.1-1.55.el7.centos.4.x86_64
libvirt-daemon-driver-storage-1.2.17-13.el7_2.5.x86_64
virt-who-0.14-9.el7_2.1.noarch
libvirt-daemon-driver-network-1.2.17-13.el7_2.5.x86_64
libvirt-1.2.17-13.el7_2.5.x86_64
fence-virt-0.3.2-2.el7.x86_64
redland-virtuoso-1.0.16-6.el7.x86_64
libvirt-python-1.2.17-2.el7.x86_64
libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.5.x86_64
virt-what-1.13-6.el7.x86_64
virtuoso-opensource-6.1.6-6.el7.x86_64
libvirt-glib-0.1.9-1.el7.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64
libvirt-daemon-1.2.17-13.el7_2.5.x86_64
libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.5.x86_64
virt-top-1.0.8-8.el7.x86_64
virt-manager-1.2.1-8.el7.noarch
libvirt-daemon-driver-secret-1.2.17-13.el7_2.5.x86_64
libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.5.x86_64
libvirt-daemon-kvm-1.2.17-13.el7_2.5.x86_64
virt-viewer-2.0-6.el7.x86_64
libvirt-client-1.2.17-13.el7_2.5.x86_64
virt-manager-common-1.2.1-8.el7.noarch
libvirt-gobject-0.1.9-1.el7.x86_64
libvirt-daemon-driver-lxc-1.2.17-13.el7_2.5.x86_64
virt-install-1.2.1-8.el7.noarch
[root@localhost ~]# rpm -qa | grep guest*
qemu-guest-agent-2.3.0-4.el7.x86_64
libguestfs-winsupport-7.2-1.el7.x86_64
libguestfs-tools-1.28.1-1.55.el7.centos.4.noarch
libguestfs-1.28.1-1.55.el7.centos.4.x86_64
libguestfs-tools-c-1.28.1-1.55.el7.centos.4.x86_64
[root@localhost ~]# 

stambata

asked Jul 11, 2016 at 7:06

Venumadhav Josyula

Libvirt is trying to start netcat on the remote system, but it is not installed.

This is shown by the error returned from the remote system:

libvirt: XML-RPC error : End of file while reading data: sh: nc: command not found: Input/output error

To fix the problem, install netcat on the remote system.
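
Before retrying the conversion, it may help to confirm whether `nc` is present on the remote host. A minimal Python sketch, assuming working SSH access to the host from the question; `remote_has_command` and its `run` parameter are names invented for this example:

```python
import subprocess


def remote_has_command(host, command="nc", run=subprocess.run):
    """Return True if `command` is found on the PATH of `host` over SSH.

    A False result can also mean the SSH connection itself failed.
    """
    result = run(
        ["ssh", host, "command -v " + command],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

For the host in the question this would be `remote_has_command("10.61.0.47")`. If it returns False, install the netcat package on that host with its package manager and rerun virt-v2v.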

answered Feb 8, 2018 at 6:35

Michael Hampton

  • #1

I just installed virt-top, as recommended by a KVM guy.
When I try to run it, I get the following error.

libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory
libvirt: VIR_ERR_SYSTEM_ERROR: VIR_FROM_RPC: Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory

Any Help ???

/Thanks Michael.

wolfgang (Proxmox Retired Staff)

  • #2

Hi,

We don’t use libvirt, so this will not work.
virt-top is a libvirt tool.

You can see the same information in the GUI:
Node -> Search

  • #3

Is there any way to see it using a CLI tool? I need to do some logging for a software idea.

wolfgang (Proxmox Retired Staff)

  • #4

pvesh get /nodes/<node>/status
pvesh get /cluster/resources
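
For logging, the output of these commands can be captured from a script. A minimal Python sketch, assuming a Proxmox VE release whose `pvesh` accepts `--output-format json` (check `man pvesh` on older installations); `cluster_resources` is a hypothetical helper name:

```python
import json
import subprocess


def cluster_resources(run=subprocess.run):
    """Return the parsed JSON from `pvesh get /cluster/resources`."""
    result = run(
        ["pvesh", "get", "/cluster/resources", "--output-format", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

The result can be written to a log file together with a timestamp on each poll. The script has to run on a Proxmox node with sufficient privileges for `pvesh`.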

I’m implementing a live migration management tool using libvirt, QEMU and Python. In my original setup, which consists of two Debian boxes, I can migrate and monitor the migration without any issues. In my production setup, which uses CentOS 6.4, I had to recompile both QEMU and libvirt to newer versions in order to support compressed migration. The migration itself seems to work, but the monitoring fails with:

libvirt: XML-RPC error : Too many job stats '19' for limit '16'
Traceback (most recent call last):
  File "./migrate_monitor_migration.py", line 27, in <module>
    remaining = vm.vm_status()
  File "/software/test/VMMigration.py", line 70, in vm_status
    return self.__update_migration_status()
  File "/software/test/VMMigration.py", line 40, in __update_migration_status
    dictionary = self.vm_job_stats()
  File "/software/test/VMMigration.py", line 37, in vm_job_stats
     return self.local_dom.jobStats()
  File "/usr/local/libvirt/lib64/python2.6/site-packages/libvirt.py", line 2045, in  jobStats
    if ret is None: raise libvirtError ('virDomainGetJobStats() failed', dom=self)
libvirt.libvirtError: Too many job stats '19' for limit '16'

As can be seen, the error is raised when calling the jobStats function of the domain.

The strange thing is that while no migration is in progress, the same monitoring call succeeds.

To complement the information I’m attaching part of the libvirtd.log:

2013-09-22 07:02:22.806+0000: 2652: error : qemuMonitorIO:616 : internal error: End of file from monitor
2013-09-22 07:05:34.120+0000: 2654: warning : qemuOpenVhostNet:495 : Unable to open vhost-net. Opened so far 0, requested 1
2013-09-22 07:05:34.120+0000: 2654: warning : qemuDomainObjTaint:1558 : Domain id=11 name='TESTVM' uuid=348ba295-7665-b7f2-020c-04303c5896a1 is tainted: high-privileges
2013-09-22 07:05:34.154+0000: 2654: error : virDBusCallMethod:1156 : The name org.freedesktop.machine1 was not provided by any .service files
2013-09-22 07:06:16.177+0000: 2655: warning : qemuMigrationCancelDriveMirror:1383 : Unable to stop block job on drive-virtio-disk0
2013-09-22 07:10:41.637+0000: 2653: warning : qemuMigrationCancelDriveMirror:1383 : Unable to stop block job on drive-virtio-disk0
2013-09-22 07:12:00.657+0000: 2657: warning : qemuMigrationCancelDriveMirror:1383 : Unable to stop block job on drive-virtio-disk0

Thanks in advance for any pointer.
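
One way to keep the monitor running while this is investigated is to fall back to `jobInfo()` when `jobStats()` raises; `jobInfo()` returns a fixed-size list rather than a variable-length set of parameters. This is a sketch of a workaround, not a confirmed fix for the error. The field labels and the `error_type` parameter are choices made for this example (pass `libvirt.libvirtError` in real use):

```python
# Labels follow the field order of the C struct virDomainJobInfo;
# the label strings themselves are this sketch's own naming.
JOB_INFO_FIELDS = (
    "type",
    "time_elapsed",
    "time_remaining",
    "data_total",
    "data_processed",
    "data_remaining",
    "mem_total",
    "mem_processed",
    "mem_remaining",
    "file_total",
    "file_processed",
    "file_remaining",
)


def job_progress(dom, error_type=Exception):
    """Return job progress for a libvirt domain as a dict.

    Tries dom.jobStats() first and falls back to dom.jobInfo()
    if jobStats() raises `error_type`.
    """
    try:
        return dict(dom.jobStats())
    except error_type:
        return dict(zip(JOB_INFO_FIELDS, dom.jobInfo()))
```

In the code from the traceback, `vm_job_stats` could return `job_progress(self.local_dom, libvirt.libvirtError)`. The keys differ slightly between the two paths, so callers should read only keys present in both, such as `data_remaining`.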
