rdma monitor displays the IB device name and the netdevice
name when displaying event info. Since users can modiy these
names, we track and notify on renaming events.
$ rdma monitor
$ rmmod mlx5_ib
[UNREGISTER] dev 1 rocep8s0f1
[UNREGISTER] dev 0 rocep8s0f0
$ modprobe mlx5_ib
[REGISTER] dev 2 mlx5_0
[NETDEV_ATTACH] dev 2 mlx5_0 port 1 netdev 4 eth2
[REGISTER] dev 3 mlx5_1
[NETDEV_ATTACH] dev 3 mlx5_1 port 1 netdev 5 eth3
[RENAME] dev 2 rocep8s0f0
[RENAME] dev 3 rocep8s0f1
$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
[UNREGISTER] dev 2 rocep8s0f0
[REGISTER] dev 4 mlx5_0
[NETDEV_ATTACH] dev 4 mlx5_0 port 30 netdev 4 eth2
[RENAME] dev 4 rdmap8s0f0
$ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
[NETDEV_ATTACH] dev 4 rdmap8s0f0 port 2 netdev 7 eth4
[NETDEV_ATTACH] dev 4 rdmap8s0f0 port 3 netdev 8 eth5
[NETDEV_ATTACH] dev 4 rdmap8s0f0 port 4 netdev 9 eth6
[NETDEV_ATTACH] dev 4 rdmap8s0f0 port 5 netdev 10 eth7
[REGISTER] dev 5 mlx5_0
[NETDEV_ATTACH] dev 5 mlx5_0 port 1 netdev 11 eth8
[REGISTER] dev 6 mlx5_1
[NETDEV_ATTACH] dev 6 mlx5_1 port 1 netdev 12 eth9
[RENAME] dev 5 rocep8s0f0v0
[RENAME] dev 6 rocep8s0f0v1
[REGISTER] dev 7 mlx5_0
[NETDEV_ATTACH] dev 7 mlx5_0 port 1 netdev 13 eth10
[RENAME] dev 7 rocep8s0f0v2
[REGISTER] dev 8 mlx5_0
[NETDEV_ATTACH] dev 8 mlx5_0 port 1 netdev 14 eth11
[RENAME] dev 8 rocep8s0f0v3
$ ip link set eth2 name myeth2
[NETDEV_RENAME] netdev 4 myeth2
$ ip link set eth1 name myeth1
** no events received, because eth1 is not attached to
an IB device **
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Extend the "rdma sys" command to display whether RDMA
monitoring is supported.
Example output for kernel where monitoring is supported:
$ rdma sys show
netns shared privileged-qkey off monitor on copy-on-fork on
Example output for kernel where monitoring is not supported:
$ rdma sys show
netns shared privileged-qkey off monitor off copy-on-fork on
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Introduce a new command for RDMA event monitoring.
This patch adds a new attribute "event_type" which describes
the event recieved. Add a new NETLINK_RDMA multicast group
and processes listening to this multicast group receive RDMA
events.
The event types supported are IB device registration/unregistration
and net device attachment/detachment.
Example output of rdma monitor and the commands which trigger
the events:
$ rdma monitor
$ rmmod mlx5_ib
[UNREGISTER] dev 3 rocep8s0f1
[UNREGISTER] dev 2 rocep8s0f0
$modprobe mlx5_ib
[REGISTER] dev 4 mlx5_0
[NETDEV_ATTACH] dev 4 mlx5_0 port 1 netdev 4 eth2
[REGISTER] dev 5 mlx5_1
[NETDEV_ATTACH] dev 5 mlx5_1 port 1 netdev 5 eth3
$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
[UNREGISTER] dev 4 rocep8s0f0
[REGISTER] dev 6 mlx5_0
[NETDEV_ATTACH] dev 6 mlx5_0 port 30 netdev 4 eth2
$ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 2 netdev 7 eth4
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 3 netdev 8 eth5
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 4 netdev 9 eth6
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 5 netdev 10 eth7
[REGISTER] dev 7 mlx5_0
[NETDEV_ATTACH] dev 7 mlx5_0 port 1 netdev 11 eth8
[REGISTER] dev 8 mlx5_0
[NETDEV_ATTACH] dev 8 mlx5_0 port 1 netdev 12 eth9
[REGISTER] dev 9 mlx5_0
[NETDEV_ATTACH] dev 9 mlx5_0 port 1 netdev 13 eth10
[REGISTER] dev 10 mlx5_0
[NETDEV_ATTACH] dev 10 mlx5_0 port 1 netdev 14 eth11
$ echo 0 > /sys/class/net/eth2/device/sriov_numvfs
[UNREGISTER] dev 7 rocep8s0f0v0
[UNREGISTER] dev 8 rocep8s0f0v1
[UNREGISTER] dev 9 rocep8s0f0v2
[UNREGISTER] dev 10 rocep8s0f0v3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 2
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 4
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 5
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
'rdma resource show cq' supports object 'dev' but not 'link', and
doesn't support device name with port.
Fixes: b0b8e32cbf ("rdma: Add CQ resource tracking information")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds a new device attribute "type", as well as supports to
add and delete a rdma device with a specific type. This new device
provides a subset of functionalists defined in IBTA spec.
Currently only type "SMI" is supported: A SMI device provides SMI (QP0)
interface; This device and it's parent associates with the same HCA port
and shares the physical link, so when the parent doesn't support SMI,
It allows the subnet manager to configure the link.
This patch also supports to print device type and parent if any.
Examples:
$ rdma dev add smi1 type SMI parent ibp8s0f1
$ rdma dev show smi1
2: smi1: node_type ca fw 20.38.1002 node_guid 9803:9b03:009f:d5ef sys_image_guid 9803:9b03:009f:d5ee type smi parent ibp8s0f1
$ rdma dev del smi1
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Update rdma_netlink.h file upto kernel commit 294424839b5e
("RDMA/nldev: Add support to dump device type and parent device if exists")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Utilize the -dd flag (driver-specific details) in the rdmatool
to view driver-specific QPs which are not exposed yet.
The following examples show mlx5 UMR QP which is visible now:
$ rdma resource show qp link ibp8s0f1
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
$ rdma resource show qp link ibp8s0f1 -dd
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 465 type DRIVER subtype REG_UMR state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
$ rdma resource show
0: ibp8s0f0: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2
1: ibp8s0f1: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2
$ rdma resource show -dd
0: ibp8s0f0: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2
1: ibp8s0f1: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Update rdma_netlink.h file up to kernel commit e18fa0bbcedf
("RDMA/core: Add an option to display driver-specific QPs in the rdmatool")
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
All these SPRINT_BUF(b) definitions are inside the 'if' block, but
accessed outside the 'if' block through the pointers 'comm'. This
leads to empty 'comm' attribute when querying resource information.
So move the definitions to the beginning of the functions to extend
their life cycle.
Before:
$ rdma res show srq
dev hns_0 srqn 0 type BASIC lqpn 18 pdn 5 pid 7775 comm
After:
$ rdma res show srq
dev hns_0 srqn 0 type BASIC lqpn 18 pdn 5 pid 7775 comm ib_send_bw
Fixes: 1808f002df ("lib/fs: fix memory leak in get_task_name()")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mixing the semantics of ending lines with the json object
leads to several bugs where json object is closed twice, etc.
Replace by breaking the meaning of newline() function into
two parts.
Now, lots of functions were taking the rdma data structure as
argument but never using it.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
With the shorter form of print_ function some of the lines can
now be shortened. Max line length in iproute2 should be 100 characters
or less.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The rdma utility should be using same code pattern as rest of
iproute2. When printing, color should only be requested when
desired; if no color wanted, use the simpler print_XXX instead.
Fixes: b0a688a542 ("rdma: Rewrite custom JSON and prints logic to use common API")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Enrich rdmatool with an option to enable or disable privileged QKEY.
When enabled, non-privileged users will be allowed to specify a
controlled QKEY.
By default this parameter is disabled in order to comply with IB spec.
According to the IB specification rel-1.6, section 3.5.3:
"QKEYs with the most significant bit set are considered controlled
QKEYs, and a HCA does not allow a consumer to arbitrarily specify a
controlled QKEY."
This allows old applications which existed before the kernel commit:
0cadb4db79e1 ("RDMA/uverbs: Restrict usage of privileged QKEYs")
they can use privileged QKEYs without being a privileged user to now
be able to work again without being privileged granted they turn on this
parameter.
rdma tool command examples and output.
$ rdma system show
netns shared privileged-qkey off copy-on-fork on
$ rdma system set privileged-qkey on
$ rdma system show
netns shared privileged-qkey on copy-on-fork on
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add support to dump SRQ resource in raw format.
This patch relies on the corresponding kernel commit aebf8145e11a
("RDMA/core: Add support to dump SRQ resource in RAW format")
Example:
$ rdma res show srq -r
dev hns3 149000...
$ rdma res show srq -j -r
[{"ifindex":0,"ifname":"hns3","data":[149,0,0,...]}]
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Replace multiple whitespaces with tab where appropriate.
While at it, fix tc flower help message and remove some double
whitespaces.
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
These are the post-merge of netwoking user headers.
Note: this fixes compilation with gcc-12
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
dcb, devlink, rdma, tipc and vdpa rely on libmnl to compile, so they
check for libmnl to be installed on their Makefiles.
This moves HAVE_MNL check from the tools to top-level Makefile, thus
avoiding to call their Makefiles if libmnl is not present.
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
RDMA_NLDEV_ATTR_RES_PID and RDMA_NLDEV_ATTR_RES_KERN_NAME cannot be set
together, as evident for the fill_res_name_pid() function in the kernel
infiniband driver. This commit makes this clear at first glance, using
an else branch for the RDMA_NLDEV_ATTR_RES_KERN_NAME case.
This also helps coverity to better understand this code and avoid
producing a bogus warning complaining about mnl_attr_get_str overwriting
comme, and thus leaking the storage that comm points to.
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
asprintf() allocates memory which is not freed on the error path of
get_task_name(), thus potentially leading to memory leaks.
%m specifier on fscanf allocates memory, too, which needs to be freed by
the caller.
This reworks get_task_name() to avoid memory allocation.
- Pass a buffer and its length to the function, similarly to what
get_command_name() does, thus avoiding to allocate memory for
the string to be returned;
- Use snprintf() instead of asprintf();
- Use fgets() instead of fscanf() to limit string length.
Fixes: 81bfd01a4c ("lib: move get_task_name() from rdma")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The addition of driver QP type with index 0xFF caused to the following
clang compilation error:
res.c:152:10: warning: result of comparison of constant 256 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
if (idx < ARRAY_SIZE(qp_types_str) && qp_types_str[idx])
~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~
Instead of allocating very sparse array, simply create separate check
for the driver QP type.
Fixes: 39307384ce ("rdma: Add driver QP type string")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The strncat() function will copy upto n bytes supplied as third
argument. The n bytes shouldn't be no more than destination and
not the source.
This change fixes the following clang compilation warnings:
res-srq.c:75:25: warning: size argument in 'strncat' call appears to be size of the source [-Wstrncat-size]
strncat(qp_str, tmp, sizeof(tmp) - 1);
^~~~~~~~~~~~~~~
res-srq.c:99:23: warning: size argument in 'strncat' call appears to be size of the source [-Wstrncat-size]
strncat(qp_str, tmp, sizeof(tmp) - 1);
^~~~~~~~~~~~~~~
res-srq.c:142:25: warning: size argument in 'strncat' call appears to be size of the source [-Wstrncat-size]
strncat(qp_str, tmp, sizeof(tmp) - 1);
^~~~~~~~~~~~~~~
Fixes: 9b272e138d ("rdma: Add SRQ resource tracking information")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch provides an extension to the rdma statistics tool
that allows to set/unset optional counters set dynamically,
using new netlink commands.
Note that the optional counter statistic implementation is
driver-specific and may impact the performance.
Examples:
To enable a set of optional counters on link rocep8s0f0/1:
$ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts
To disable all optional counters on link rocep8s0f0/1:
$ sudo rdma statistic unset link rocep8s0f0/1 optional-counters
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch introduces the "mode" command, which presents the enabled or
supported (when the "supported" argument is available) optional
counters.
An optional counter is a vendor-specific counter that may be
dynamically enabled/disabled. This enhancement of hwcounters allows
exposing of counters which are for example mutual exclusive and cannot
be enabled at the same time, counters that might degrades performance,
optional debug counters, etc.
Examples:
To present currently enabled optional counters on link rocep8s0f0/1:
$ rdma statistic mode link rocep8s0f0/1
link rocep8s0f0/1 optional-counters cc_rx_ce_pkts
To present supported optional counters on link rocep8s0f0/1:
$ rdma statistic mode supported link rocep8s0f0/1
link rocep8s0f0/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Update rdma_netlink.h file upto kernel commit 7301d0a9834c
("RDMA/nldev: Add support to get status of all counters")
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>